WORLD INTELLECTUAL PROPERTY ORGANIZATION

International Bureau



INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6:

C12N 15/85, 15/13, C07K 16/28, C12N 
15/43, 15/28, C07K 14/525, C12N 15/26, 
C07K 14/55, C12N 15/19, C07K 14/52, 
C12N 5/10, 15/62, C07K 19/00, G01N 
33/53, 33/60, C12Q 1/68 


A1


(11) International Publication Number: WO 98/11241 
(43) International Publication Date: 19 March 1998 (19.03.98) 


(21) International Application Number: PCT/EP97/04765 

(22) International Filing Date: 2 September 1997 (02.09.97) 

(30) Priority Data: 

96114820.2 16 September 1996 (16.09.96) EP
(34) Countries for which the regional or international application was filed: GB et al.

96115635.3 30 September 1996 (30.09.96) EP
(34) Countries for which the regional or international application was filed: GB et al.

(71) Applicant (for all designated States except US): MERCK PATENT GMBH
[DE/DE]; Frankfurter Strasse 250, D-64293 Darmstadt (DE).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): VON HOEGEN, Ilka
[DE/BE]; Route de Renipont 25a, B-1380 Ohain (BE).
BURGE, Christa [DE/DE]; Carsonweg 23, D-64289 Darmstadt (DE).
BROMMER, Wolfgang [DE/DE]; A, Grenzweg 9, D-64665 Alsbach (DE).
DUNKER, Reinhard [DE/DE]; Am Wemsbach 9, D-64354 Reinheim (DE).
RIEKE, Erwin [DE/DE]; Hermannstrasse 12, D-64342 Seeheim-Jugenheim (DE).
WELGE, Thomas [DE/DE]; Am Steinernen Kreuz 58a, D-64297 Darmstadt (DE).
HAUSER, Hansjorg [DE/DE]; Mascharoder Weg 1, D-38124 Braunschweig (DE).
MIELKE, Christian [DE/DE]; Mascharoder Weg 1, D-38124 Braunschweig (DE).

(74) Common Representative: MERCK PATENT GMBH; Postfach, D-64271 Darmstadt (DE).

(81) Designated States: AU, CA, JP, US, European patent (AT, BE,
CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).

Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: OLIGOCISTRONIC EXPRESSION SYSTEM FOR THE PRODUCTION OF HETEROMERIC PROTEINS 




(57) Abstract

OLIGOCISTRONIC EXPRESSION SYSTEM FOR THE PRODUCTION OF HETEROMERIC PROTEINS

The present invention relates to a mammalian expression system for the 
production of recombinant heteromeric proteins, preferably antibodies and 
antibody fusion proteins such as antibody-cytokine fusion proteins and 
fragments thereof, by means of tri- or oligocistronic expression vectors 
which are under the control of a strong promoter/enhancer unit and which 
contain a selection marker as one of the cistrons. Together with at least
two IRES elements, this selection marker guarantees robust and stable
production of the heteromeric proteins in excellent yields.



Background of the invention 

For the expression of heteromeric proteins such as antibody molecules in
mammalian cells, two vectors have traditionally been used, which frequently
leads to unpredictable overexpression of one of the protein chains in
comparison with the second one. Where one chain is relatively overexpressed,
the cells begin to suffer, resulting in instability of production and/or in
purification problems (e.g. light chain dimers). One traditional way to
overcome this problem is to cotransfer the vectors into the host cells in a
well-defined ratio. This requires that the plasmid copies are accepted and
integrated simultaneously and stably, and that the plasmid ratio remains
constant during cell division. Satisfactory results have so far been obtained
for only a few systems.

Another traditional way is to use independent transcription units located on
one plasmid. Thus, the different genes are present on the vector in a correct
ratio. Provided that promoters of comparable strength are used, equal amounts
of the desired protein chains should be obtained. However, different stability
and translation efficiencies of the mRNAs which are coding for the different
proteins, and different transcription efficiencies of the genes, lead to an
unequal synthesis of the desired protein chains.

To avoid these problems, di- and multicistronic [...] were [...]. In such
systems the gene unit used (coding for the [...]) is under the control of one
single promoter [...] the terminus is [...] translation occurs according to
the "cap"-dependent mechanism. The following cistrons are translated
insufficiently, or not at all [...]. Translation of the following cistrons in
multicistronic systems can be initiated and enhanced by using sequences having
no "cap" structure. Such sequences are obtainable from non-translated sections
of some viruses, such as poliovirus and encephalomyocarditis virus (Jang et
al., 1988, J. Virol. [...]; [...] J. Virol. 63:1651; Pelletier and Sonenberg,
1988, Nature 334:320). Within the virus sequences a short section which is not
translated and called IRES (internal ribosomal entry site) can be used to
allow translational reinitiation [...] the cap. Such sequences have to be
interspersed between the cistrons to make a multicistronic mRNA functional.
IRES sequences do not influence the "cap"-dependent translation of the first
cistron. However, it was found that the "cap"-dependent translation is, as a
rule, more effective than the IRES-dependent translation, which means that the
proteins are expressed in a non-stoichiometric ratio and, finally, leads to a
loss of stability. Thus, it is very difficult to produce two or more proteins
in equimolar ratios even by means of a bi- or oligocistronic expression unit.

Bicistronic expression systems and vectors, respectively, using non-antibody
genes [...]. In these systems a gene coding for a selection marker was used as
second cistron. International patent publication WO 94/05785 discloses a
general teaching of expression units in which more than one IRES element can
be theoretically inserted into the vector construction. In detail, only a
bicistronic expression system is described using well defined genes, namely
encoding PDGF chains A and B (platelet derived growth factor) separated by an
IRES containing unit. No selection marker is used in this system.

It has not been reported until now that heteromeric proteins such as antibody
heavy and light chains have been expressed in stoichiometric and stable
formation by tri- or oligocistronic systems. It has not been reported,
furthermore, that the use of a selection marker as one of the cistrons leads
to transformed cells which have an extraordinarily high stability.

Equimolar and stable production of the heteromeric protein chains, such as the
heavy and light chain of antibodies, is a prerequisite for a correct
association and folding of the two chains, and, therefore, for a correct
steric conformation which is important in order to achieve an optimal
biological activity of the associated heteromeric protein or peptide chains.

In the case of an antibody fusion protein, the biologically active ligand for
an antibody-directed targeting should induce the destruction of the target
cell either directly or through creating an environment lethal to the target
cell. The biologically active ligand can be a cytokine such as IL-1, IL-2,
IL-4, IL-6, IL-7, IL-10, IL-13, IFNs, TNFα or CSFs. These cytokines have been
shown to elicit anti-tumor effects either directly or by activating host
defense mechanisms (e.g. Mire-Sluis, TIBTECH, 11:74). For instance, IL-2 is
considered the central mediator of the immune response. IL-2 has been shown to
stimulate the proliferation of T-cells and NK-cells and to induce
lymphokine-activated killer cells (LAK). IL-2 enhances the cytotoxicity of
T-cells and monocytes. TNF alpha has found a wide application in tumor
therapy, mainly due to its direct cytotoxicity for certain tumor cells and the
induction of hemorrhagic regression of tumors. In addition, TNF alpha
potentiates the immune response: it is a costimulant of T-cell proliferation,
and it induces expression of MHC class I and II antigens and secretion of TNF
alpha, IFN and IL-1 by macrophages. However, most of the known cytokines
activate effector cells, but show no or only weak chemotactic activity.
Chemokines, however, are chemotactic for many effector cells and enhance
their presence at the tumor site and induce a variety of effector cell
functions (e.g. Miller and Krangel, 1992, "Biology and Biochemistry of the
Chemokines,...", Critical Reviews in Immunology 12:17). Examples of suitable
chemokines according to the invention are IL-8, MIP 2α and MIP 2β, which are
members of the C-X-C chemokine superfamily (also known as small cytokine
superfamily or intercrines).

Epidermal growth factor (EGF) is a polypeptide hormone which is mitogenic
for epidermal and epithelial cells. When EGF interacts with sensitive cells,
it binds to membrane receptors (EGFR). The EGFR is a trans-membrane
glycoprotein of about 170 kD, and is a gene product of the c-erb-B
proto-oncogene.

The murine monoclonal antibody mAb425 was raised against the human A431
carcinoma cell line (ATCC CRL 1555; US 5,470,571) and was found to bind to a
polypeptide epitope on the external domain of the EGFR. It was found to
inhibit the binding of EGF [...] and to mediate tumor cytotoxicity in vitro
and to suppress tumor cell growth of epidermal and colorectal
carcinoma-derived cell lines in vitro (Rodeck et al., 1987, Cancer Res.
47:3692).

Humanized and chimeric versions of mAb425 are known from WO 92/15683.
Fusion proteins of mAb425 (as a whole or fragments thereof) and cytokines
or chemokines are described in European patent publications EP 0659 439
and EP 0706 799.

Summary of the invention



Thus, it is an object of the present invention to provide an expression system
suitable for the stable production of a heteromeric protein, preferably an
antibody, and more preferably an antibody fusion protein, which avoids the
problems of the prior art systems as described above.



It has been found as a result of this invention that a proper expression of
these heteromeric proteins can be achieved by using oligocistronic expression
units comprising at least two IRES elements where the different heteromeric
chains, e.g. the heavy and light protein chain of an antibody, are cotranslated
from one mRNA molecule comprising a sequence encoding a selection marker. The
strength of the effect caused by the selection marker in this system is
surprising and could not be expected compared with usual expression systems of
the prior art. The effect is especially strong when the gene encoding the
selection marker is located at the end of all cistrons, each separated by IRES
units. This is not the case if the selection pressure is removed or if the
selection marker is used in traditional expression vectors. Using the selection
marker as last cistron forces the cell to produce the linked protein or
proteins.



The constructs according to the invention allow equimolar production of the
heteromeric protein chains and guarantee selection and stable, long-term
expression of the optimal production clones by concomitant expression of the
selection marker, because only those clones which express the entire cistronic
expression unit will grow under selection pressure.



It has been found that the combination of a selection marker gene and an
IRES sequence located behind a bicistronic unit (to form a tricistronic unit)
comprising the sequence coding for the light chain of an antibody, an IRES
sequence and a sequence coding for a fusion protein consisting of the heavy
chain of an antibody fused to another biologically active protein, such as a
cytokine or chemokine, is very advantageous [...] expression in excellent
yields.

It is an objective of the present invention to provide a new expression system
for eucaryotic cells which ensures a stable, reproducible and robust
production process for recombinant single and multi-chain protein
complexes such as antibodies or, especially, antibody-cytokine fusion
proteins.

The present invention relates to a mammalian expression system for the
production of heteromeric proteins, preferably recombinant antibodies and
more preferably antibody fusion proteins such as antibody-cytokine fusion
proteins and fragments thereof.

The invention relates, preferably, to such an expression system which is able
to produce antibody fusion proteins or fragments thereof, wherein the
antibody binding sites are directed to the human EGF-receptor and the
antibody is covalently linked to a biologically active ligand such as a growth
and/or differentiation factor, above all TNF alpha or IL-2. The invention
discloses a set of vectors which comprise oligocistronic, preferably tri- and
tetracistronic, expression units driven by a single strong promoter hybrid
linked to genes encoding protein chains of the light chain, the heavy chain
and the active ligand and, additionally, a selection marker in the
promoter-distal position. Cotranslation of these proteins from one
oligocistronic mRNA guarantees strict coupling of expression and allows
stoichiometric production of protein chains.



Therefore, it is an object of the invention to provide an oligocistronic
expression vector suitable for the production of a heteromeric protein
consisting of at least two protein chains in a mammalian host cell comprising

(i) a promoter / enhancer sequence,

(ii) a sequence encoding a first chain of the heteromeric protein or a fragment
thereof,

(iii) a sequence encoding a second chain of the heteromeric protein or a
fragment thereof,

(iv) optionally a sequence encoding a third or further chain of the
heteromeric protein or a fragment thereof,

(v) a sequence encoding a selection marker, and

(vi) at least two sequences comprising a 5'-UTR poliovirus sequence
containing an IRES element.

It has been found now that the order of the genes located in the vector
construct is important with respect to the described advantageous effects.
Thus, especially, the gene coding for the selection marker should be located as
last cistron within the vector construct. Additionally, in the case of an
antibody, the gene encoding the light chain of the antibody should be located
upstream of the gene coding for the heavy chain.

Therefore, it is a preferred object of the invention to provide said expression
vector, wherein the sequences (i) to (vi) are in the following order from
upstream to downstream progression of said vector construct:

(1) a sequence comprising the promoter / enhancer sequence (i),

(2) a sequence comprising the sequence encoding a first chain of the heteromeric
protein or a fragment thereof (ii),

(3) a sequence (vi) comprising a first IRES element,

(4) a sequence comprising the sequence encoding a second chain of the
heteromeric protein or a fragment thereof (iii),

(5) a sequence (vi) comprising a second IRES element,

(6) optionally a sequence comprising the sequence encoding a third or further
chain of the heteromeric protein or a fragment thereof (iv), a sequence
comprising a third or further IRES element (vi) included,

(7) a sequence comprising the selection marker (v).
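
The ordering rules (1) to (7) can be sketched as a small validity check. The following Python is an illustrative sketch only and is not part of the disclosed method: the `Element` class, the kind labels and the function `follows_preferred_order` are hypothetical names introduced here. The check encodes only the order stated above, namely promoter/enhancer first, an IRES element between consecutive cistrons, at least two IRES elements, and the selection marker as last cistron.

```python
from dataclasses import dataclass

# Hypothetical element kinds, introduced only for this illustration.
PROMOTER, CHAIN, IRES, MARKER = "promoter", "chain", "ires", "marker"


@dataclass(frozen=True)
class Element:
    kind: str   # one of PROMOTER, CHAIN, IRES, MARKER
    label: str  # free-text description, e.g. "light chain"


def follows_preferred_order(unit):
    """Return True if `unit` is: promoter, chain, (IRES, chain)..., IRES, marker."""
    kinds = [element.kind for element in unit]
    # The shortest valid unit is tricistronic: promoter + 3 cistrons + 2 IRES.
    if len(kinds) < 6 or kinds[0] != PROMOTER or kinds[-1] != MARKER:
        return False
    body = kinds[1:]  # cistrons and IRES elements must alternate
    if len(body) % 2 == 0:
        return False
    for index, kind in enumerate(body[:-1]):
        expected = CHAIN if index % 2 == 0 else IRES
        if kind != expected:
            return False
    return kinds.count(IRES) >= 2


# Assumed example layout of a tricistronic unit (labels are descriptive only).
tricistronic = [
    Element(PROMOTER, "promoter/enhancer"),
    Element(CHAIN, "light chain"),
    Element(IRES, "first IRES"),
    Element(CHAIN, "heavy chain or heavy chain fusion"),
    Element(IRES, "second IRES"),
    Element(MARKER, "selection marker"),
]
```

A layout in which the selection marker is not the last cistron, or in which two cistrons are not separated by an IRES element, is rejected by this check.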

The advantage of this system is also shown in Fig. 17 and 18. Under selection
[...] heteromeric protein, but without selection pressure or with the
selection marker in the "wrong" position the stable productivity is rapidly
lost. The greatest advantage of the system is that (heteromeric) proteins can
be expressed which can be toxic to the host cells, like proteases, glutamate
receptor subtypes and serotonin receptor subtypes, or antibody fusion proteins
wherein the non-antibody partner is normally highly toxic for the host cells.

Preferably, a corresponding expression system is object of the invention,
wherein the sequence (ii) encodes the light chain and the sequence (iii)
comprises a sequence encoding the heavy chain of a monoclonal antibody or a
fragment thereof. However, the teaching of this invention is also applicable
for heteromeric proteins other than antibodies, for heteromeric proteins having
more than two chains, and even normal (one-chain) proteins having toxic
activity against the host cell and, finally, heteromeric proteins (e.g.
antibody fusion proteins) having strong toxic activity caused by a part of said
heteromeric protein.

Furthermore, a corresponding expression system is object of the invention,
wherein the sequence (iii) consists of two sequences (iiia, iiib), wherein
(iiia) encodes the heavy chain of an antibody or a fragment thereof and (iiib)
encodes a biologically active ligand, such as a cytokine or a chemokine or a
fragment thereof, in order to form a fusion protein.

It has been found, additionally, that such expression vector constructs are
preferred, and therefore object of the invention, wherein the sequence of
(iiia) is shortened at its C-terminus and the sequence (iiib) at its
N-terminus, each by 1 to 15 amino acids.

A special and preferred embodiment of the invention is a tricistronic expression 
vector as defined above and in the claims, wherein the sequence (iiia) and the 
sequence (iiib) are linked directly in order to encode a fusion protein. 
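
The direct linkage of a C-terminally shortened heavy chain to an N-terminally shortened ligand can be illustrated at the level of amino acid strings. The following Python is a minimal sketch under stated assumptions: `fuse_chains` and its parameters are hypothetical names introduced here, the sequences passed in are arbitrary placeholder strings of one-letter codes (not sequences from this document), and the default limit of 15 residues simply mirrors the preferred range given above.

```python
def fuse_chains(heavy_chain, ligand, trim_c=0, trim_n=0, max_trim=15):
    """Join a heavy-chain sequence directly to a ligand sequence.

    `trim_c` residues are removed from the C-terminus of the heavy chain and
    `trim_n` residues from the N-terminus of the ligand before joining.
    No linker is inserted (direct fusion).
    """
    for name, value in (("trim_c", trim_c), ("trim_n", trim_n)):
        if not 0 <= value <= max_trim:
            raise ValueError(f"{name} must be between 0 and {max_trim}")
    if trim_c >= len(heavy_chain) or trim_n >= len(ligand):
        raise ValueError("trimming would remove an entire chain")
    heavy_part = heavy_chain[: len(heavy_chain) - trim_c]
    ligand_part = ligand[trim_n:]
    return heavy_part + ligand_part


# Placeholder strings only; these are not real antibody or cytokine sequences.
example = fuse_chains("AAAAKKK", "MMMLLLL", trim_c=3, trim_n=3)
```

With both trims set to 0 the function reduces to plain concatenation, which corresponds to linking the unshortened sequences (iiia) and (iiib) directly.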

In addition, the expression vector according to the invention may, optionally,
contain eucaryotic sequence elements such as SAR/MAR elements to further
increase production and stability of the system. The expression of certain
genes has been reported to respond positively to butyrate. The stimulatory
effect of butyrate is largest if one or two scaffold/matrix-attached regions
(SAR/MAR elements) are present adjacent to the gene (Schlake et al., 1994,
Biochemistry 33:4197). Only after integration of the constructs into the
genome of the host cell do these regions increase the expression of adjacent
genes in an orientation- and position-independent fashion. Gene activation
causes the apparent loss of nucleosome structure ahead of the SAR element, and
a similar change has been demonstrated by the action of butyrate. SARs and
butyrate act synergistically in enhancing gene expression (Klehr et al., 1992,
Biochemistry 31:3223).

Therefore, an expression vector defined above and in the claims is object of
the invention, comprising, additionally, one or two, preferably two, SAR
elements. Preferably, one SAR element is located in front of the
promoter/enhancer region and the second one behind the sequence encoding the
selection marker. However, other locations are also possible.



Preferably, the invention relates to antibody fusion proteins, wherein the
non-antibody protein is a biologically active protein. Preferably, such
expression vectors are object of the invention, wherein a sequence (iiib) is
used which encodes a cytokine or chemokine such as TNF alpha, IL-2 and IL-8.

Above all, such expression vectors are object of the invention, wherein the 
sequences (ii) and (iii) comprise sequences coding for the light and heavy chain of 
a monoclonal anti-EGFR antibody, preferably, humanized monoclonal antibody 
425 (mAb425) or fragments thereof. However, the invention is not restricted to 
anti-EGFR antibody or mAb425, respectively, but includes also any other 
monoclonal antibodies directed to a variety of specificities, for example mAb361.

As an especially preferred embodiment it is object of the invention to provide
an expression vector comprising the following units in the given order: the
CMV/MPSV promoter/enhancer sequence, followed by the sequence encoding the
mAb425 light chain, followed by the sequence from 5'-UTR poliovirus containing
an IRES element, followed by a fusion gene encoding a fusion protein consisting
of the heavy chain of humanized mAb425 fused at its C-terminus to TNF alpha or
IL-2, followed by another IRES element from 5'-UTR poliovirus, followed by a
sequence coding for puromycin acetyl transferase as selection marker and,
finally, a nucleotide sequence derived from the polyadenylation signal of SV40.

Furthermore, the well-defined expression vector comprising the nucleotide and 
amino acid sequences depicted in Figure 15 is object of this invention. 

Additionally, it is an object of the invention to provide an expression system 
comprising a mammalian host cell transformed with an expression vector specified 
above and in the claims, preferably, wherein the host cell is CHO or BHK. 

Finally, it is an object of this invention to make available a process for the
production of a heteromeric protein, preferably an antibody, especially an
antibody fusion protein, especially a mAb425/TNF alpha or mAb425/IL-2 antibody
fusion protein, or fragments thereof, by cultivating the host cells of an
expression system as specified above and in the claims in a suitable nutrient
medium and separating, if a tricistronic vector is used, the complete and
active antibody fusion protein from the cells and / or the medium.

Brief Descriptions of the Figures 

Fig. 1 (a-e):

Expression plasmids for the generation of tricistronic expression vectors.
AmpR = Ampicillin resistance gene; IRES = Poliovirus derived internal
ribosomal entry site; MPSV = Promoter/Enhancer; CMV = Cytomegalovirus
promoter; Puromycin R = Puromycin resistance gene; SV 40 pA = SV 40
polyadenylation site.

Fig. 2: 

Stability of BHK-21 mAb425CH1 clones. Stability of three different clones
was determined over the time period indicated. The production of
mAb425CH1 fusion protein of 10^6 cells/ml per 24 hrs was determined in an
anti-Ig based ELISA. Cells were cultured in medium with (+P) or without
(-P) Puromycin.

Fig. 3: 

Stability of a BHK21 mAb425CH1 TNFα clone. Cells were cultured in
DMEM medium for 89 days without selection pressure. The production of
mAb425CH1 fusion protein of 10^6 cells/ml per 24 hrs was determined in an
anti-Ig based ELISA.

Fig. 4:

Stability of a BHK21 mAb425CH3-IL-2 clone. Cells were cultured in
DMEM medium for 48 days without selection pressure. The production of
mAb425CH1 fusion protein of 10^6 cells/ml per 24 hrs was determined in an
anti-Ig based ELISA.

Fig. 5: 

SDS PAGE of purified mAb425CH3-IL-2. Lane 1: mAb425CH3-IL-2; Lane 2:
IgG1 control antibody. Proteins were run on a 4 to 15% gradient gel
(Phast System, Pharmacia) and stained with Coomassie.

Fig. 6: 

FACS analysis of purified mAb425CH3-IL-2. The human EGF-R-positive 
carcinoma cell line A431 was incubated with the indicated antibody
concentrations. Two different preparations of purified mAb425CH3-IL-2 
were compared with purified mAb425 reference antibody. 

Fig. 7: 

Determination of IL-2 activity of purified mAb425CH3-IL-2. IL-2-dependent
mouse CTLL2 cells were incubated with mAb425CH3-IL-2 or rec. human IL-2
(WHO Standard). Concentrations of fusion protein are shown as mAb425
equivalents, determined by an ELISA based on an anti-idiotype antibody
specific for mAb425. 5x10^4 cells were cultured for 2 days and pulsed with
0.5 µCi 3H-thymidine 18 hrs before harvesting.

Fig. 8: 

pMCLDHAP tricistronic vector for the expression of mAb425CH3-TNFα.
AmpR = Ampicillin resistance gene; IRES = Poliovirus derived internal
ribosomal entry site; MPSV = Promoter/Enhancer; CMV = Cytomegalovirus
promoter; Puromycin R = Puromycin resistance gene; SV 40 pA = SV 40
polyadenylation site.

Fig. 9:

Stability of a BHK21 mAb425CH3-TNFα clone. Cells were cultured in
DMEM medium for 48 days without selection pressure. The production of
mAb425CH1 fusion protein of 10^6 cells/ml per 24 hrs was determined in an
anti-Ig based ELISA.

Fig. 10: 

Integrity of expression vector DNA in the absence of selective pressure.
BHK-21 cell clones transfected with pMCLDHAP and expressing
mAb425CH3-TNFα fusion protein were either cultivated under puromycin
pressure (+) or grown in the absence of puromycin (-) for the indicated times.
Graph A shows antibody fusion protein secretion (ng IgG/ml x 24 hr). B is a
Southern blot of chromosomal DNA prepared from cells which were taken at
the indicated times. The DNA was restricted with PstI and hybridized with a
labelled PstI fragment from pMCLDHAP (1231 bp) encompassing part of
the heavy chain fusion protein encoding cDNA (hc). mbhl represents a
single copy DNA fragment (1900 bp) of a hamster c-myc gene which was
cohybridized using a specific probe (see example 7). Since both probes are
labelled with the same specific activity and their length is similar, the
intensity of the hc band corresponds to the copy number of the integrated
expression plasmid.
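
The reasoning in the Fig. 10 legend, that the heavy chain band intensity reflects plasmid copy number when compared with a co-hybridized single copy fragment, amounts to a simple ratio. The following Python is a rough sketch under the assumptions stated in the legend (equal specific activity and similar probe length); `relative_copy_number` and its parameters are hypothetical names introduced here, and the intensity values are arbitrary densitometry units, not data from this document.

```python
def relative_copy_number(hc_intensity, reference_intensity, reference_copies=1.0):
    """Estimate plasmid copies from Southern blot band intensities.

    Assumes both probes have the same specific activity and similar length,
    so that signal is proportional to the number of target copies.
    `reference_copies` is the known copy number of the reference fragment.
    """
    if hc_intensity < 0 or reference_intensity <= 0:
        raise ValueError("intensities must be non-negative, reference above zero")
    return reference_copies * hc_intensity / reference_intensity


# Arbitrary illustrative intensities (not measured values).
estimate = relative_copy_number(300.0, 100.0)
```

Comparing such estimates between samples grown with and without puromycin is one way to express whether the integrated vector DNA is retained over time.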



Fig. 11: 

FACS analysis of purified mAb425CH3-TNFα. The human EGF-R-positive
carcinoma cell line A431 was incubated with the indicated antibody
concentrations. Purified mAb425CH3-TNFα was compared with purified
humanized mAb425 reference antibody.

Fig. 12:

Determination of TNFα activity of purified mAb425CH3-TNFα on MCF7
cells. The TNFα-sensitive and EGF-R-negative human breast
adenocarcinoma cell line MCF7 was used to determine the TNFα activity of
the mAb425CH3-TNFα fusion protein. Concentrations of fusion protein are
shown as mAb425 equivalents, determined by an ELISA based on an
anti-idiotypic antibody specific for mAb425. Humanized mAb425 and rTNFα are
mixed at a ratio of 6:1 reflecting the molecular ratio of both parts in the
fusion protein. 5x10^4 cells were cultured for 4 days and pulsed with 0.5 µCi
3H-thymidine 18 hrs before harvesting.

Fig. 13: 

TNFα-mediated cytotoxicity of purified mAb425CH3-TNFα is dependent
on TNFα sensitivity. The TNFα-resistant and EGF-R-positive human
carcinoma cell line A431 was used to determine the specificity of the
mAb425CH3-TNFα fusion protein. Concentrations of fusion protein are
shown as mAb425 equivalents, determined by an ELISA based on an
anti-idiotype antibody specific for mAb425. Humanized mAb425 and
rTNFα are mixed at a ratio of 6:1 reflecting the molecular ratio of both parts
in the fusion protein. 5x10^4 cells were cultured for 4 days and pulsed with
0.5 µCi 3H-thymidine 18 hrs before harvesting.

Fig. 14: 

mAb425CH3-TNFα is highly cytotoxic for EGF-R-positive and
TNFα-sensitive human tumor cell lines. The human mammary carcinoma cell line
BT20 and the human melanoma cell line C8161 are both TNFα-sensitive
and EGF-R-positive. Concentrations of fusion protein are shown as mAb425
equivalents, determined by an ELISA based on an anti-idiotypic antibody
specific for mAb425. mAb425 and rTNFα are mixed at a ratio of 6:1
reflecting the molecular ratio of both parts in the fusion protein. 5x10^4
cells were cultured for 4 days and pulsed with 0.5 µCi 3H-thymidine 18 hrs
before harvesting.

Fig. 15: 

Complete nucleotide and amino acid sequence (coding regions) of
mAb425CH3-TNFα as shown in Fig. 8.

Fig. 16:

History of relevant vectors of the invention.

Fig. 17:

Stability of different antibody fusion protein cell clones
(rBHK21mAb425-CH3-IL2). A = mAb425; stability of 3 different clones is tested.
The production of fusion protein of 10^6 cells / ml in 24 h is determined in
the ELISA detecting the antibody part. Cells are cultured for the indicated
days in medium with (+P) or without (-P) Puromycin.

Fig. 18: 

Stability of the cell clone rBHK21mAb425-CH3-IL2698-8 with (CHO-M + P)
and without (CHO-M - P) selection pressure (puromycin). The stability is
tested for 70 days in culture. The production of protein of 10^6 cells / ml
in 24 h is determined in an ELISA detecting the antibody part of the protein.






Detailed Description

Above and below the term "heteromeric protein" means a protein which
naturally consists of two or more chains. Only if the corresponding chains
are associated and folded correctly can the full biological activity of the
heteromeric protein be obtained.

Above and below the term "mAb425CH1" means an antibody construction
containing the light chain, the variable region of the heavy chain, and the
CH1 domain of the constant region of mAb425.

Above and below the term "mAb425CH2" means an antibody construction
containing the light chain, the variable region of the heavy chain, and the
CH1 and CH2 domain of the constant region of mAb425.

Above and below the term "mAb425CH3" means an antibody construction
containing the light chain, the variable region of the heavy chain, and the
CH1, CH2 and CH3 domain of the constant region of mAb425. This
construct corresponds to the complete antibody.

Above and below, the term "a sequence encoding" does not mean
exclusively the specific coding sequence, but may include also a sequence
comprising said specific coding sequence, provided that no other statement is
made.

Said additional sequences indicated above and coding for proteins [ii, iii
(iiia, iiib), iv, v] can be prolonged or shortened each by 1 to 20 amino acids
provided that the specific biological properties are not substantially altered.
Prolongation can be caused, for example, by linker or leader peptides.
Furthermore, the expression vector constructs according to the invention may
contain introns which are not translated into amino acids. Prolongations and
deletions of coding regions may occur, preferably, at the C- and / or
N-terminus of the corresponding specific peptide or protein. Preferred
deletions according to the invention may occur at the C-terminus of the heavy
chain of the antibody and the N-terminus of the biological ligand.

Furthermore, the invention includes also mutations and variants of the
sequences indicated in detail having the same or a very similar biological
activity. Such mutations and variants can be produced by accident (e.g.
spontaneous mutations, natural radiation) or by intended chemical or
physical activities.

The term "antibody fragment" means according to the invention an antibody
fragment as defined above (mAb-CH1, mAb-CH2) as well as complete
antibody (mAb-CH3) which is shortened by 1 to 20 amino acids at the
C-terminus of its constant region.

The term "biological active ligand" means according to the invention any
protein or peptide ligand which is effective against a target cell, above all,
against a target cell which is recognized by the antibody part of the antibody
fusion protein. The effect of the biological ligand may be, for instance, a
toxic and / or lysing and / or inhibiting one against the target cell,
preferably a tumor cell. Examples of suitable biological active ligands are
given above.

The term "biological active ligand fragment" means according to the present
invention a biological ligand (cytokines, chemokines) which is usually
shortened by 1 to 20 amino acids at its N-terminus which is connected
directly, or optionally via a linker peptide, to the (optionally shortened)
C-terminus of the constant region of the antibody heavy chain.

All microorganisms, cell lines, plasmids, promoters, resistance markers,
replication origins, restriction sites or other fragments or parts of vectors
which are mentioned in the description not directly in connection with the
claimed invention are commercially or otherwise generally available.
Provided that no other hints are given, they are used only as examples and
are not essential with respect to the invention, and can be replaced by other
suitable tools and biological materials, respectively.

The techniques which are essential according to the invention are described
in detail below and above. Other techniques which are not described in detail
correspond to known standard methods which are well known to a person
skilled in the art, or are described in more detail in the cited references and
patent applications and in the standard literature (e.g. Sambrook et al., 1989,
Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor;
Harlow, Lane, 1988, Antibodies: A Laboratory Manual, Cold Spring
Harbor).

The selection marker according to the invention can be in principle any
known selection marker suitable for high expression systems. Examples are
enzymes such as puromycin acetyl transferase or neomycin
phosphotransferase. Puromycin acetyl transferase is preferred according to
this invention.

Alternatively, dominant acting genetic markers useful for monitoring gene
transfer in mammalian cells that are based on procaryotic genes encoding
key steps in the synthesis of the essential amino acids, such as tryptophan or
histidine, can be used. Under appropriate conditions, expression of these
genes obviates the nutritional requirement for their respective amino acid
product. Expression of the β-subunit of tryptophan synthase (trpB, EC
4.2.1.20) of Escherichia coli allows mammalian cell survival and
multiplication in medium containing indole in place of tryptophan. The
hisD gene of Salmonella typhimurium encodes histidinol dehydrogenase (EC
1.1.1.23), which catalyses the two-step NAD+-dependent oxidation of
L-histidinol to L-histidine. In medium lacking histidine and containing
histidinol only mammalian cells expressing the hisD gene survive. Use of
these markers is advantageous over the use of antibiotics because for either
trp or his selection the substitute nutrients indole or histidinol are readily
available, inexpensive, stable, permeable to cells and convertible to the end
product in a step controlled by one gene (Bode et al. 1995, Int. Rev. Cytol.,
R. Berezney & K.W. Jeon eds., Academic Press, Vol. 162A:389).

As IRES sequences, all sequences of viral, synthetic or cellular origin
which allow an internal binding of ribosomes can be used.
Examples of such sequences are the 5'-UTR elements from poliovirus type
1, 2 or 3 (picornavirus), from "encephalomyocarditis virus" (EMCV)
(Sugimoto et al., 1994, BioTechnol. 12:694), from "Theiler's murine
encephalomyelitis virus" (TMEV), from "foot and mouth disease virus"
(FMDV), from "bovine enterovirus" (BEV), and from "coxsackie B virus"
(CBV).

The tri- or oligocistronic expression vector according to the invention works
with a single strong promoter/enhancer unit. Examples of suitable
promoters/enhancers are: CMV (Boshart et al., 1985, Cell 41:521);
MPSV-LTR (Laker et al., 1987, Proc. Natl. Acad. Sci. USA 74:8458); MPSV-CMV;
RSV (Gorman et al., 1982, Proc. Natl. Acad. Sci. USA 79:6777); SV40
(Artelt et al., 1988, Gene 128:247). The system MPSV (enhancer)-CMV
(promoter of the cytomegalovirus) is the preferred unit according to
the invention.

The fusion protein described in the examples contains a monoclonal
antibody with specificity for the human EGF receptor (EGFR). The
monoclonal antibody mAb425 was raised against the human A431 carcinoma cell
line and found to bind to a polypeptide epitope on the external domain of the
EGFR. The heavy chain of mAb425 was fused C-terminally to
cytokines/chemokines such as IL-2, IL-4, IL-7, TNFα and IL-8 as
biologically active ligands. The constructs encoding these
immunoconjugates were generated with recombinant DNA technologies. As pointed
out above, the immunoconjugates contain the variable region of the
antibody heavy chain and the CH1 domain of the constant region
(antibody-CH1 conjugates), or the CH1 and CH2 domains of the constant region
(antibody-CH2 conjugates), or the CH1, CH2 and CH3 domains of the
constant region (antibody-CH3 conjugates) fused to the biologically active
ligand. By addition of the appropriate light chain, immunoconjugates can be
generated which target antigen-bearing cells and deliver an active ligand to
a specific site in the body. The C-terminal amino acid sequence of the
junctional region of CH1 and CH3 fusion proteins is not involved in any
secondary structure elements according to the hypothetical computer model.
In these regions several putative sites for proteolytic cleavage are present. In
order to retain or increase chemical and biological stability these sequences can
be shortened up to the limit at which the biological activity of the ligand is lost.
N-terminal cytokine sequences are frequently involved in receptor binding
and biological activity; e.g. in human TNFα the amino acid sequences between
positions 11 and 35 appear to be critical for receptor binding and triggering
of biological responses (Goh & Porter, Prot. Eng. 4:385, 1991). In those
cases where loss of activity is caused by inaccessibility of relevant amino
acids due to interference of the antibody part, linker sequences can be
introduced which consist of repetitive units containing amino acids which do
not interfere with chemical stability and biological activity, e.g. see Curtis et
al., Proc. Natl. Acad. Sci. USA, 88:5809, 1991.



In a preferred embodiment according to the invention a system of expression
vectors is provided, which allows easy generation of expression vectors for
synthesis of three proteins from a tricistronic expression unit.

In a preferred embodiment according to the invention tricistronic vectors
have been constructed in which IgG light chain, heavy chain-cytokine fusion
protein and a selectable marker are translated from one mRNA. Sequences of
translation reinitiation elements (internal ribosomal entry sites = IRES)
derived from the 5'-UTRs of poliovirus, which mediate a cap-independent
internal initiation of translation, are interspersed between the cistrons.

In a preferred embodiment according to the invention the tricistronic mRNA
is transcribed from any strong promoter such as a single hybrid MPSV/CMV
promoter/enhancer.

In a further preferred embodiment the selection marker may be puromycin
acetyl transferase, neomycin phosphotransferase or procaryotic genes such as
the β-subunit of tryptophan synthase (trpB) derived from E. coli or the
histidinol dehydrogenase (hisD) of Salmonella typhimurium or any
resistance marker known in the art. The selection marker is preferably
located in the promoter-distal position to ensure stable expression of the
entire cistron.

In another preferred embodiment of the invention expression is further
enhanced by inclusion of one or two, preferably two, scaffold/matrix-attached
regions (SAR/MAR elements) into the expression vector.
Expression can be synergistically enhanced by SAR/MAR elements and butyrate
added to the medium.

In another preferred embodiment of the invention the protein sequence
between both parts of the fusion protein can be shortened up to the limit at
which the biologically active ligand loses its activity.

In another preferred embodiment of the invention both parts of the fusion
protein can be combined by introducing linker sequences which consist of
repetitive units containing preferentially the amino acids alanine, glycine and
serine.
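
The junction design described in the two preceding embodiments (optional shortening at the junction, optional repetitive Ala/Gly/Ser linker) can be illustrated with a minimal Python sketch. The function name `build_junction`, the example repeat unit `GGGGS` and the toy sequences are assumptions made purely for illustration; the application does not specify a particular linker unit.

```python
def build_junction(heavy_chain, ligand, trim_c=0, trim_n=0,
                   linker_unit="GGGGS", repeats=0):
    """Join a heavy-chain sequence to a ligand sequence (one-letter codes).

    trim_c  residues removed from the C-terminus of the heavy chain (0..20)
    trim_n  residues removed from the N-terminus of the ligand (0..20)
    linker_unit / repeats  hypothetical repetitive linker, limited here to
                           alanine (A), glycine (G) and serine (S)
    """
    for n in (trim_c, trim_n):
        if not 0 <= n <= 20:
            raise ValueError("this sketch allows trims of 0 to 20 residues only")
    if not set(linker_unit) <= set("AGS"):
        raise ValueError("linker unit should contain only A, G and S")
    heavy = heavy_chain[:len(heavy_chain) - trim_c]  # shorten C-terminus
    lig = ligand[trim_n:]                            # shorten N-terminus
    return heavy + linker_unit * repeats + lig


# Toy sequences, not real antibody or cytokine sequences.
fusion = build_junction("AAAAKK", "MMPPP", trim_c=2, trim_n=2, repeats=2)
print(fusion)
```

Whether a given truncation or linker length preserves ligand activity has to be determined experimentally; the sketch only shows how the sequence variants are composed.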



Furthermore, it is an objective of the invention to manufacture said proteins, such
as immunoconjugates, by transferring the expression vector which contains the
tricistronic construct into appropriate host cells such as BHK-21 cells, CHO cells,
SP2/0 cells or myeloma cells.

Generation of fusion protein constructs consisting of mAb425 and cytokines
or chemokines has been disclosed in EP 0659 439 and EP 0706 799,
respectively. Fusion proteins have been constructed on the basis of chimeric
and humanized mAb425 with cDNAs encoding cytokines such as IL-2, IL-4,
IL-7 and TNFα or chemokines such as IL-8, MIP-2α and MIP-2β fused
to the CH1, CH2 or CH3 domain of the constant region of the mAb425
heavy chain, respectively. The techniques used can be taken, for example,
from the two European patent publications indicated above, which are
incorporated in this application by reference.

The vector system according to the invention leads to a new and innovative
production system for high expression of heterodimeric proteins in
eucaryotic cells, such as antibody-cytokine/chemokine fusion proteins. Light
chain and heavy chain-cytokine/chemokine fusion are transcribed together
with a selectable marker from one tricistronic mRNA. The advantage of this
system is twofold: First, unpredictable overexpression of one of the two chains,
which frequently leads to instability of production and purification problems,
will be avoided because both chains will be produced in equimolar amounts.
Secondly, coupling of product and selection marker in the promoter-distal
position guarantees stable and long-term expression of the product. Taken
together, the system described herein represents a robust process for
production of complex proteins in eucaryotic cells employing different
fermentation techniques.

Introduction of vector constructs for the expression of a monovalent
immunoconjugate including only the CH1 domain or divalent
immunoconjugates including the CH1, CH2 and CH3 domains into host
cells can be achieved by electroporation, DEAE-dextran, calcium
phosphate, Lipofectin, protoplast fusion or any method known in the art.

Any host cell type may be used provided that the recombinant DNA
sequences encoding the immunoconjugate and the appropriate light chain are
properly transcribed into mRNA in that cell type. Host cells may be mouse
myeloma cells which do not produce immunoglobulin such as Sp2/0-AG14
(ATCC CRL 1581), NS0 (Galfre & Milstein, 1991, Meth. Enzymol.
73(B):3), P3X63 Ag8.653 (ATCC CRL 1580) or hamster cells such as
CHO-K1 (ATCC CCL 61), or CHO/dhFr- (ATCC CRL 9096), or BHK-21 (ATCC
CCL 10). Selection for transfected host cells is done in the presence of the
selection marker encoded by the third cistron of the tricistronic expression
vector. Clones are analyzed for expression of immunoconjugates by
EGF-receptor or cytokine-specific ELISAs. Selected clones are then further
purified by limiting dilution cloning.

Examples



Example 1 

Generation of basic vectors 

The vectors pSBC-1 and pSBC-2 (Dirks et al., 1993, Gene 128:247) have
been developed as monocistronic expression vectors. Both vectors contain
the SV40 origin of replication, the SV40 early promoter, the SV40 19S splice
donor and 19S acceptor, the SV40 polyadenylation signal, procaryotic
sequences such as the origin of replication from ColE1 and the ampicillin
resistance gene. In addition pSBC-1 contains the internal ribosomal entry site
sequence (IRES) of poliovirus for the generation of dicistronic messenger
RNAs when appropriately combined with pSBC-2. pSBC vectors were
altered by replacing the promoter fragment (ClaI/XhoI) by a hybrid
promoter/enhancer composed of an MPSV enhancer of 300 bp (ClaI/XbaI)
(Dirks et al., Gene 128:247, 1993) and a PCR-amplified huCMV promoter
fragment with XbaI and XhoI ends (bp 220-807 from HEIEE EMBL
database) and by replacing the EcoRI-HindIII polylinker by a HindIII-EcoRI
polylinker to give pMC-1 (Fig. 1A) and pMC-2 (Fig. 1B), respectively.
Based on these vectors a set of vectors has been generated which allows
generation of tricistronic expression vectors in a straightforward cloning
strategy. The vectors pMC-1 and pMCC-1 (Fig. 1C) are identical except for
the multi-cloning sites to facilitate insertion of restriction fragments. In these
vectors the promoter-proximal cistron has to be inserted. pMC-2 and pMCC-2
(Fig. 1D) are also identical except for the multi-cloning site and allow
expression of one protein chain, but do not contain a selection marker.
The vector pMC-2P (Fig. 1E) was created in several steps. First, the
blunt-ended fragment of the puromycin resistance gene from pSV2pac (Vara et al.,
1986, Nucl. Acid Res. 14:4617) was cloned into the NotI site of pMCC-1. In
the resulting plasmid the XbaI/EcoRI fragment was replaced by the analogous
fragment from pMCC-2, thereby inserting a new NotI site. The resulting
plasmid is called pMCC-2P (Fig. 1F). pMC-2P was created by exchanging
the polylinker into a HindIII/EcoRI polylinker. pMC-2PS (Fig. 1G) was
created by insertion of a scaffold-attached region sequence (SAR) of 800 bp
from the human interferon-β gene as described (Mielke et al. 1990,
Biochemistry 29:7475). All three vectors contain an IRES sequence followed
by the selection marker, in this case puromycin resistance.
After cloning of the respective DNA fragments encoding the protein chains
to be expressed into the appropriate vectors, generation of a tricistronic
expression vector is performed as follows: A ClaI/NotI restriction fragment
containing the promoter-proximal cistron followed by an IRES sequence is
derived from the vectors pMC-1 or pMCC-1, respectively. A NotI/ClaI
restriction fragment containing the second cistron followed by an IRES
sequence and the selection marker is derived from the vectors pMCC-2P,
pMCC-2, pMC-2P, and pMC-2PS. By combination of these two fragments a
complete expression vector is generated.
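
The two-fragment combination described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the `Fragment` class and the `ligate_circular` function are hypothetical names, each fragment lists only the elements named in the text, and backbone elements of the plasmids are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    left_site: str    # restriction site at the 5' end
    right_site: str   # restriction site at the 3' end
    elements: tuple   # ordered genetic elements carried by the fragment


def ligate_circular(a, b):
    """Combine two fragments into a circular construct.

    The ends must pair up: the right end of `a` with the left end of `b`,
    and the right end of `b` with the left end of `a`.
    """
    if a.right_site != b.left_site or b.right_site != a.left_site:
        raise ValueError("fragment ends are not compatible")
    return a.elements + b.elements


# ClaI/NotI fragment from pMC-1 or pMCC-1
first = Fragment("ClaI", "NotI", ("promoter-proximal cistron", "IRES"))
# NotI/ClaI fragment from e.g. pMC-2P
second = Fragment("NotI", "ClaI", ("second cistron", "IRES", "selection marker"))

vector = ligate_circular(first, second)
print(" -> ".join(vector))
```

Because the two fragments carry different sites at each end, they can only join in one orientation, which is what places the selection marker in the promoter-distal position.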

Example 2 

Cells and gene transfer 

BHK-21 cells (a subclone of ATCC number CCL-10) were cultivated in
DMEM supplemented with 10 % fetal calf serum (FCS), 20 mM glutamine,
60 µg/ml penicillin and 100 µg/ml streptomycin.

Calcium phosphate transfections were carried out essentially as described
before (Mielke et al. 1990, Biochemistry 29:7475). At least 5 µg of uncut
plasmids were used without the addition of carrier DNA. Stable transfectants
were selected and, where indicated, cultivated in medium containing
puromycin (Sigma) at a final concentration at which only cells expressing the
puromycin resistance marker can grow, e.g. 5 µg/ml for BHK-21 cells.

Example 3

Quantification of secreted antibody 

10⁶ cells/ml were seeded in 25 cm² culture flasks in serum-free medium and
incubated for 24 hours. Medium samples of these cultures were taken for
quantification of secreted IgG chains in a specific ELISA. For this purpose,
96-well immunoplates (Nunc) were coated with an affinity-purified goat
anti-human IgG antibody (Fab' specific, Sigma #1-5260). After incubation with
serial dilutions of medium samples, the bound antibody contained in these
samples was detected by application of a peroxidase-conjugated affinity-purified
goat anti-human IgG antibody (Dianova #109-035-088) and subsequent
staining with ortho-phenylenediamine dihydrochloride (OPD)/H2O2.
Quantification was made possible by simultaneous application of an IgG
standard (human IgG1/kappa, Sigma #13889). No unspecific background was
detectable under these conditions, as shown by use of medium supernatants
of untransfected cells.
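
Quantification against a standard applied on the same plate can be sketched as follows. This is a minimal illustration assuming piecewise-linear interpolation between standard points; the function name `concentration_from_od` and all numerical values are hypothetical and are not data from this application.

```python
def concentration_from_od(od, standards):
    """Interpolate a concentration from an optical density (OD) reading.

    standards: list of (concentration, od) pairs measured for the standard.
    Readings outside the range of the standard curve are rejected.
    """
    points = sorted(standards, key=lambda p: p[1])
    if not points[0][1] <= od <= points[-1][1]:
        raise ValueError("OD outside the range of the standard curve")
    for (c0, od0), (c1, od1) in zip(points, points[1:]):
        if od0 <= od <= od1:
            if od1 == od0:
                return c0
            return c0 + (c1 - c0) * (od - od0) / (od1 - od0)
    return points[0][0]  # only reached for a single-point curve


# Hypothetical standard curve (ng/ml, OD) and a hypothetical sample reading.
standard_curve = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05), (100.0, 1.85)]
dilution_factor = 20
in_well = concentration_from_od(0.65, standard_curve)
in_sample = in_well * dilution_factor
print(round(in_well, 3), round(in_sample, 3))
```

Serial dilutions of each sample make it possible to pick a reading that falls inside the range covered by the standard before applying the dilution factor.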



Example 4 

Production of mAb425CH1-IL2 fusion protein
Generation of a tricistronic expression vector

Generation of the DNA sequence encoding the mAb425CH1-IL2 fusion
protein has been disclosed in EP 0659 439 and EP 0706 799. A
HindIII/EcoRI fragment containing the entire mAb425CH1-IL-2 heavy chain
was ligated into the multi-cloning site of the pMC2PSAH vector. The
NotI/ClaI fragment of this construct was ligated with the ClaI/NotI fragment
from pMCLAHAP containing the mAb425 light chain. The resulting
construct contains the light chain in the promoter-proximal position followed
by the heavy-chain-IL-2 fusion and the puromycin resistance. The genes are
interspersed with two IRES sequences to allow transcription of all three
cistrons into one messenger RNA.

Establishment of a recombinant BHK-21 cell line producing
mAb425CH1-IL2 fusion protein

BHK-21 cells (ATCC CCL 10) were transfected with the tricistronic expression
vector encoding mAb425CH1-IL2 fusion protein by the calcium phosphate
method with a commercially available kit (InVitrogen) according to the
manufacturer's instructions. Selection for transfected BHK-21A cells was
done in the presence of 5 µg/ml puromycin (Sigma). Clones were analyzed for
expression of immunoconjugates by EGF-receptor or cytokine-specific
ELISAs. Selected clones were then further purified by limiting dilution
cloning. In the presence of puromycin many clones could be isolated which
stably express the mAb425CH1-IL2 fusion protein. Three examples are
shown in Fig. 2.



Example 5 

Expression of a mAb425CH1-TNFα fusion protein
Generation of a tricistronic expression vector

Generation of the DNA sequence encoding the mAb425CH1-TNFα fusion
protein has been disclosed in EP 0659 439 and EP 0706 799. The heavy
chain-TNFα fusion gene construct was generated on the basis of the heavy
chain-IL-2 fusion gene. The KpnI/EcoRI fragment containing part of the
heavy chain variable region, the CH1 domain and IL-2 was subcloned into
pUC19. In this construct the NcoI/EcoRI fragment containing the
IL-2-encoding sequences was exchanged with the NcoI/EcoRI fragment
containing the TNFα-encoding sequences. The KpnI/EcoRI fragment of this
construct was combined in pUC18 with the HindIII/KpnI fragment
containing the 5' part of the heavy chain variable region to generate the
full-length heavy chain-TNFα fusion gene. The HindIII/EcoRI fragment was
ligated into the multi-cloning site of the pMC2PSAH vector. The NotI/ClaI
fragment of this construct was ligated with the ClaI/NotI fragment from
pMCLAHAP containing the mAb425 light chain. The resulting construct
contains the light chain in the promoter-proximal position followed by the
heavy-chain-TNFα fusion and the puromycin resistance. The genes are
interspersed with two IRES sequences to allow transcription of all three
cistrons into one messenger RNA.

Establishment of a recombinant BHK-21 cell line producing
mAb425CH1-TNFα fusion protein

The establishment of a recombinant BHK-21 cell line producing
mAb425CH1-TNFα fusion protein was performed as described in
Example 4 for the mAb425CH1-IL-2 fusion protein. We could isolate several
clones which stably express the mAb425CH1-TNFα fusion protein for more
than 12 weeks even without selection pressure. One example is shown in
Fig. 3.



Example 6

Expression of a mAb425CH3-IL-2 fusion protein
Generation of a tricistronic expression vector

Generation of the DNA sequence encoding the mAb425CH3-IL-2 fusion
protein has been disclosed in EP 0659 439 and EP 0706 799. The
HindIII/EcoRI fragment containing the complete heavy chain-IL-2 fusion
gene was cloned into the multi-cloning vector pMC-2P. The NotI/ClaI
fragment of this construct was ligated with the ClaI/NotI fragment from
pMCLAHAP containing the mAb425 light chain. The resulting construct
contains the light chain in the promoter-proximal position followed by the
heavy-chain-IL-2 fusion and the puromycin resistance. The genes are
interspersed with two IRES sequences to allow transcription of all three
cistrons into one messenger RNA.

Establishment of a recombinant BHK-21 cell line producing
mAb425CH3-IL-2 fusion protein




Stable BHK-21 cell lines expressing mAb425CH3-IL-2 fusion protein have
been established as described in example 5. Several clones could be isolated
which stably express the mAb425CH3-IL-2 fusion protein for several weeks
even in the absence of selection. One example is shown in Fig. 4.

Purification of mAb425CH3-IL2

Transfected BHK cells (rBHK21A-CH3-IL2/K69-8) were fermented in a
semicontinuous manner and the fusion protein was isolated from the collected,
cell-free supernatant.

The first purification step was performed by affinity chromatography on
carrier-bound Protein A (Pharmacia) using the extended bed technology. The
starting conditions were 0,1 M phosphate buffer, pH 8,5. Impurities were
removed with 0,2 M glycine buffer, pH 5,0 and subsequently the fusion
protein was eluted from the sedimented gel bed with 0,2 M glycine buffer, pH
3,3. The pH of the eluate was immediately neutralized by adding 10 %
(vol./vol.) 1 M TRIS solution and brought up to pH 8 - 8,5.
In a second purification step further impurities were separated by cation
exchange chromatography on Fractogel EMD SO3- 650(S) (Merck). The
starting conditions were 10 mM phosphate buffer, pH 6,0 (conductivity 2
mS). The fusion protein was eluted with a NaCl gradient (0 - 0,6 M NaCl).
The final purification step was done by size exclusion chromatography on
Fractogel BioSEC 650(S) (Merck) in PBS, pH 7,4. Up to 5 % aggregates and
small amounts of impurities with smaller molecular weight were separated.
Concentration and diafiltration were done by ultrafiltration (Amicon).
Membranes with a cut-off of 30 kDa were used.

Detection of protein-containing fractions was done by SDS-PAGE and an
ELISA specific for human Ig with affinity-purified goat anti-human Fc as
catcher antibody and affinity-purified goat anti-human F(ab)2 coupled to
alkaline phosphatase for detection (both Dianova).

The protein content of the preparation was about 1 mg/ml. The endotoxin
content was < 1 EU/mg fusion protein. The purity of the protein preparation
could be demonstrated by SDS-PAGE (Fig. 5). In Western blots the identity of
heavy and light chain could be verified (data not shown).

Functional analysis of recombinant mAb425CH3-IL-2 fusion protein

FACS analysis with EGF-R-positive cells showed that binding of the
antibody portion is identical to a mAb425 control (Fig. 6). Furthermore, IL-2
activity is indistinguishable from the activity of recombinant IL-2 (Fig. 7),
indicating that interaction with the IL-2 receptor is not
impaired in the fusion protein. Taken together, it can be concluded that the
expression system described herein provides high amounts of the
mAb425CH3-IL-2 fusion protein, which is fully active with respect to
antigen binding and IL-2 activity.

Example 7 

Expression of a mAb425CH3-TNFα fusion protein
Generation of a tricistronic expression vector

The PCR-amplified coding region of the recombinant light chain
(HindIII-EcoRI) gene was inserted into pMC-1 at the polylinker site. The puromycin
resistance gene coding sequence was inserted between the IRES sequence
and the polyadenylation site of pMC-2 to give pMC-2P. The heavy
chain-cytokine fusion protein genes were inserted into the polylinker sequence of
pMC-2P. The XmnI/NotI fragments of both immunoglobulin chain vectors
were combined to give e.g. pMCLDHAP, an 8298 bp tricistronic expression
vector for IgG-TNF-alpha and puromycin acetyltransferase (Fig. 8).
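
The layout of this 8298 bp vector can be cross-checked against the feature table given for SEQ ID NO: 1 in the sequence listing. The following Python sketch is only an illustration: the coordinates are transcribed from that feature table (with digits damaged in the scanned listing read in the most plausible way, which is an assumption), and the helper names `find_gaps` and `cds_lengths` are hypothetical.

```python
# (feature key, start, end, short description), 1-based inclusive coordinates
FEATURES = [
    ("promoter", 1, 904, "MPSV/CMV enhancer/promoter"),
    ("intron", 905, 976, ""),
    ("CDS", 977, 1018, "leader sequence (part)"),
    ("intron", 1019, 1106, "5'UTR poliovirus"),
    ("CDS", 1107, 1433, "light chain variable region plus leader (rest)"),
    ("intron", 1434, 1595, ""),
    ("CDS", 1596, 1913, "light chain constant region"),
    ("misc_feature", 1914, 2581, "5'UTR poliovirus + IRES + intron"),
    ("CDS", 2582, 4537, "heavy chain + TNFalpha fusion"),
    ("misc_RNA", 4565, 5279, "5'UTR poliovirus + IRES + intron"),
    ("CDS", 5280, 5876, "puromycin acetyl transferase"),
    ("misc_RNA", 5877, 8298, "region comprising SV40 polyA"),
]
TOTAL_LENGTH = 8298


def find_gaps(features, total_length):
    """Return (start, end) intervals not covered by any annotated feature."""
    gaps = []
    expected = 1
    for _, start, end, _ in sorted(features, key=lambda f: f[1]):
        if start > expected:
            gaps.append((expected, start - 1))
        expected = max(expected, end + 1)
    if expected <= total_length:
        gaps.append((expected, total_length))
    return gaps


def cds_lengths(features):
    """Return the length in bp of each CDS feature, in listing order."""
    return [end - start + 1 for key, start, end, _ in features if key == "CDS"]


gaps = find_gaps(FEATURES, TOTAL_LENGTH)
lengths = cds_lengths(FEATURES)
print(gaps, lengths)
```

Under this reading the annotations cover the whole plasmid apart from a short unannotated stretch following the heavy chain-TNFalpha coding region, and every CDS length is a multiple of three.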

Establishment of a recombinant BHK-21 cell line producing
mAb425CH3-TNFα fusion protein

BHK-21 cells were transfected with the tricistronic expression vector
encoding mAb425CH3-TNFα fusion protein using the calcium phosphate
precipitation method as detailed by Mielke et al. (1990, Biochemistry
29:7475). 5 µg of uncut plasmid were used without the addition of carrier
DNA. Stable transfectants were selected and cultivated in medium
containing puromycin (Sigma) at a final concentration of 5 µg/ml. Clones
were analysed for expression of immunoconjugates by IgG-specific ELISA.
Selected clones were further purified by limiting dilution cloning. We could
isolate several clones which stably express mAb425CH3-TNFα fusion
protein even in the absence of selection. One example is shown in Fig. 9.

Chromosomal DNA analysis

Isolation of genomic DNA: Cells from a 141 cm² culture dish were harvested
in 20 ml TEN buffer [40 mM Tris/HCl (pH 7.5), 1 mM EDTA, 150 mM
NaCl], split into two portions and pelleted for 5 min at 1000 rpm in a
table-top centrifuge. One of these cell pellets was thoroughly resuspended in 1 ml
of TEN and then mixed with 1 ml of 2x extraction buffer [20 mM Tris/HCl
(pH 8), 200 mM EDTA, 1 % SDS, 40 µg/ml RNase A]. After 5 h of
incubation at 37 °C, 50 µl Proteinase K solution (20 mg/ml) was added and
incubation was continued overnight. Following a standard phenolization
step, the DNA solution was dialyzed against TE and was then used without
any further precipitation steps.

Southern blots/methylation pattern: 20 µg of genomic DNA was digested
overnight with the indicated restriction enzyme in a total volume of 500 µl,
precipitated by addition of 300 µl 2-propanol and pelleted at 13000 rpm,
4 °C in a microcentrifuge. DNA pellets were carefully resuspended in 40 µl of
1x loading buffer [2.5 % Ficoll (Type 400), 17 mM EDTA, 0.01 % Xylene
Cyanol FF]; 20 µl were applied to a 0.8 % TAE agarose gel and
electrophoresed. Gels were then blotted onto nylon membranes (Zeta probe,
Biorad) with 0.4 M NaOH overnight and membranes were then hybridized
to the indicated radiolabeled (Rediprime, Amersham) DNA probes
according to the manufacturer's recommendations and following the protocol of
Church and Gilbert (Church, G.M. and Gilbert, W. (1984), PNAS 81,
1991-1995) (Fig. 10).

Purification of mAb425CH3-TNFα

Transformed BHK cells (rBHK21A-CH3-TNFα/SC7.4) were fermented in a
semicontinuous manner and the fusion protein was isolated from the collected,
cell-free supernatant.

The first purification step was performed by affinity chromatography on
carrier-bound Protein A (Pharmacia) using the extended bed technology. The
starting conditions were 0,1 M phosphate buffer, pH 8,5. Impurities were removed
with 0,2 M glycine buffer, pH 5,0 before the fusion protein was eluted from the
sedimented gel bed with 0,2 M glycine buffer, pH 3,3. The pH of the eluate was
immediately brought up to pH 8 - 8,5 by adding 10 % (vol./vol.) 1 M TRIS solution.

The second purification step was done by chromatography on hydroxyapatite
(Merck). The starting conditions were 5 mM phosphate, pH 7,0. The elution was
performed with a phosphate gradient (5 - 500 mM).

The final purification step was done by size exclusion chromatography on
Fractogel BioSEC 650(S) (Merck) in PBS, pH 7,4 as described above. Up to 5 %
aggregates and small amounts of impurities with smaller molecular weight were
separated.

Concentration and diafiltration were done by ultrafiltration. Membranes with a
cut-off of 30 kDa were used.

Detection of protein-containing fractions was done by SDS-PAGE and an ELISA
specific for human Ig with affinity-purified goat anti-human Fc as catcher antibody
and affinity-purified goat anti-human F(ab)2 coupled to alkaline phosphatase for
detection (both Dianova).

The protein content of the preparation was about 1 mg/ml. The endotoxin content
was < 1 EU/mg fusion protein.

Assessment of functionality of mAb425CH3-TNFα fusion protein

The functionality of mAb425CH3-TNFα with respect to antigen binding was
demonstrated by FACS analysis (Fig. 11). The fusion protein binds to
EGF-R-positive cells with the same quality as the mAb425 control antibody.

TNFα activity of the mAb425CH3-TNFα fusion protein was investigated on
different human tumor cell lines. MCF7 is a human mammary carcinoma cell
line which is not EGF-R positive. The inhibition of proliferation is therefore
exclusively based on TNFα activity. As demonstrated in Fig. 12, the growth
inhibition induced by the mAb425CH3-TNFα fusion protein is virtually
identical to that of recombinant TNFα. mAb425 alone does not have any
effect on proliferation of MCF7.

mAb425 was raised against the human carcinoma cell line A431, which is
highly positive for EGF-R expression (Rodeck et al.). It was demonstrated
previously that mAb425 is internalized upon binding to A431 cells. A431 is
not TNFα sensitive and neither the mAb425CH3-TNFα fusion protein nor the
combination of mAb425 and recombinant TNFα has any effect on the
growth of A431 cells (Fig. 13), indicating that the growth inhibition
specifically requires expression of TNFα receptors. Lack of TNFα receptors
cannot be overcome through internalization of the mAb425CH3-TNFα fusion
protein mediated by the EGF-R.

BT20, a human mammary carcinoma cell line, and C8161, a human melanoma
cell line, are both EGF-R positive and TNFα sensitive. The density of EGF-R
on the cell surface is higher on BT20 than on C8161 as determined by
FACS analysis (data not shown). The proliferation of both cell lines is
strongly inhibited by the mAb425CH3-TNFα fusion protein (Fig. 14). The
effect is more pronounced on BT20 cells than on C8161, which might be due
to the increased EGF-R expression, which leads to a higher crosslinking of
TNFα receptors and thus increased signal transduction. These experiments
clearly demonstrate the superiority of the mAb425CH3-TNFα fusion protein
when compared to the combination of mAb425 and TNFα. This could be
explained by the crosslinking of TNFα receptors on one side due to capping
of EGF-R on the other side. Thereby signal transduction is maximally
enhanced.






SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT:
          (A) NAME: Merck Patent GmbH
          (B) STREET: Frankfurter Str. 250
          (C) CITY: Darmstadt
          (E) COUNTRY: Germany
          (F) POSTAL CODE (ZIP): 64271
          (G) TELEPHONE: 49-6151-72-7022
          (H) TELEFAX: 49-6151-72-7191

    (ii) TITLE OF INVENTION: Oligocistronic Expression System for the
          Production of Antibody Fusion Proteins

   (iii) NUMBER OF SEQUENCES: 6

    (iv) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 8298 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: circular

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

     (v) FRAGMENT TYPE: N-terminal

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: humanized mAb425-TNFalpha fusion protein
          (B) STRAIN: E. coli K12
          (G) CELL TYPE: Fibroblast
          (H) CELL LINE: BHK-21

(ix) FEATURE:
    (A) NAME/KEY: promoter
    (B) LOCATION: 1..904
    (D) OTHER INFORMATION: /function= "Enhancer/promoter MPSV/CMV"

(ix) FEATURE:
    (A) NAME/KEY: intron
    (B) LOCATION: 905..976

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 977..1018
    (D) OTHER INFORMATION: /product= "leader sequence (part)"

(ix) FEATURE:
    (A) NAME/KEY: intron
    (B) LOCATION: 1019..1106
    (D) OTHER INFORMATION: /function= "5'UTR poliovirus"

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1107..1433
    (D) OTHER INFORMATION: /function= "FRs, CDRs"
        /product= "light chain hmAb425, variable region,
        plus leader (rest)"

(ix) FEATURE:
    (A) NAME/KEY: intron
    (B) LOCATION: 1434..1595

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1596..1913
    (D) OTHER INFORMATION: /product= "light chain hmAb425,
        constant region"

(ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1914..2581
    (D) OTHER INFORMATION: /product= "5'UTR from poliovirus +
        IRES (2029-2159) + intron"

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 2582..4537
    (D) OTHER INFORMATION: /product= "Fusion protein: heavy
        chain hmAb425 + TNFalpha (from 4064)"

(ix) FEATURE:
    (A) NAME/KEY: misc_RNA
    (B) LOCATION: 4565..5279
    (D) OTHER INFORMATION: /product= "5'UTR from poliovirus
        plus IRES plus intron"

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 5280..5876
    (D) OTHER INFORMATION: /function= "selection marker"
        /product= "puromycin acetyl transferase"

(ix) FEATURE:
    (A) NAME/KEY: misc_RNA
    (B) LOCATION: 5877..8298
    (D) OTHER INFORMATION: /product= "DNA sequence comprising
        SV40 PolyA (5929-6181)"
        /standard_name= "SV40 PolyA"
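
The CDS locations given in the feature table can be cross-checked against the lengths stated for SEQ ID NO: 2 to 6. The following is a minimal sketch in Python, not part of the original listing. It assumes that the locations are 1-based and inclusive, and that each CDS corresponds to the protein sequence in the same position of the listing; the names `CDS_LOCATIONS`, `STATED_LENGTHS`, `codon_count` and `check_cds_lengths` are illustrative only.

```python
# CDS locations copied from the feature table of SEQ ID NO: 1
# (1-based, inclusive "start..end" notation).
# ASSUMPTION: the pairing of each CDS with a SEQ ID NO follows the order of
# the listing; the listing itself does not state this pairing explicitly.
CDS_LOCATIONS = {
    2: (977, 1018),    # leader sequence (part)
    3: (1107, 1433),   # light chain hmAb425, variable region, plus leader (rest)
    4: (1596, 1913),   # light chain hmAb425, constant region
    5: (2582, 4537),   # fusion protein: heavy chain hmAb425 + TNFalpha
    6: (5280, 5876),   # puromycin acetyl transferase
}

# Lengths in amino acids as stated under SEQUENCE CHARACTERISTICS.
STATED_LENGTHS = {2: 14, 3: 109, 4: 106, 5: 652, 6: 199}


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the inclusive range start..end."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"{start}..{end} is not a whole number of codons")
    return length // 3


def check_cds_lengths() -> dict:
    """Map each SEQ ID NO to True if its CDS length matches the stated length."""
    return {
        seq_id: codon_count(*CDS_LOCATIONS[seq_id]) == STATED_LENGTHS[seq_id]
        for seq_id in CDS_LOCATIONS
    }


if __name__ == "__main__":
    for seq_id, ok in check_cds_lengths().items():
        start, end = CDS_LOCATIONS[seq_id]
        print(f"SEQ ID NO: {seq_id}: {start}..{end} -> "
              f"{codon_count(start, end)} codons, matches stated length: {ok}")
```

Under these assumptions each of the five ranges is a whole number of codons equal to the stated protein length, which is consistent with the stop codons lying outside the annotated CDS ranges.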

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TCGATAATGA AAGACCCCAC CTGTAGGTTT GGCAAGCTAG CTTAAGTAAC GCCATTTTGC 60

AAGGCATGGG AAAAATACAT AACTGAGAAT AGAGAAGTTC AGATCAAGGT CAGGAACAGA 120

GAAACAGGAG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG CCCCGCTCAG 180

GGCCAAGAAC AGTTGGAACA GGAGAATTGG GCCAAACAGG ATATCTGTGG TAAGCAGTTC 240

CTGCCCCGCT CAGGGCCAAG AACAGATGGT CCCCAGATGC GGTCCCGCCC TCAGCAGTTT 300

CTAGACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT 360

GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA 420

ATGGGTGGAG TATTTACGGT AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC 480

AAGTACGCCC CCTATTGACG TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA 540

CATGACCTTA TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600

CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG 660

ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG 720

GGACTTTCCA AAATGTCGTA ACAACTCCGC CCCATTGACG CAAATGGGCG GTAGGCGTGT 780

ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG 840

CCATCCACGC TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCGAGGAACT 900

GGAAAACCAG AAAGTTAACT GGTAAGTTTA GTCTTTTTGT CTTTTATTTC AGGTCCCGGA 960



ATTAAGCTTC GCCACC ATG GGA TGG AGC TGT ATC ATC CTC TTC TTG GTA 1009
                  Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val
                  1 5 10

GCA ACA GCT ACAGGTAAGG GGCTCACAGT AGCAGGCTTG AGGTCTGGAC 1058
Ala Thr Ala

ATATATATGG GTGACAATGA CATCCACTTT GCCTTTCTCT CCACAGGT GTC CAC TCC 1115
                                                     Val His Ser



GAC ATC CAG ATG ACC CAG AGC CCA AGC AGC CTG AGC GCC AGC GTG GGT 1163
Asp Ile Gln Met Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly
5 10 15

GAC AGA GTG ACC ATC ACC TGT AGT GCC AGC TCA AGT GTA ACT TAC ATG 1211
Asp Arg Val Thr Ile Thr Cys Ser Ala Ser Ser Ser Val Thr Tyr Met
20 25 30 35

TAT TGG TAC CAG CAG AAG CCA GGT AAG GCT CCA AAG CTG CTG ATC TAC 1259
Tyr Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr
40 45 50

GAC ACA TCC AAC CTG GCT TCT GGT GTG CCA AGC AGA TTC AGC GGT AGC 1307
Asp Thr Ser Asn Leu Ala Ser Gly Val Pro Ser Arg Phe Ser Gly Ser
55 60 65

GGT AGC GGT ACC GAC TAC ACC TTC ACC ATC AGC AGC CTC CAG CCA GAG 1355
Gly Ser Gly Thr Asp Tyr Thr Phe Thr Ile Ser Ser Leu Gln Pro Glu
70 75 80

GAC ATC GCC ACC TAC TAC TGC CAG CAG TGG AGT AGT CAC ATA TTC ACG 1403
Asp Ile Ala Thr Tyr Tyr Cys Gln Gln Trp Ser Ser His Ile Phe Thr
85 90 95

TTC GGC CAA GGG ACC AAG GTG GAA ATC AAA CGTGAGTAGA ATTTAAACTT 1453
Phe Gly Gln Gly Thr Lys Val Glu Ile Lys
100 105

TGCTTCCTCA GTTGGATCCA TCTGGGATAA GCATGCTGTT TTCTGTCTGT CCCTAACATG 1513

CCCTGTGATT ATGCGCAAAC AACACACCCA AGGGCAGAAC TTTGTTACTT AAACACCATC 1573

CTGTTTGCTT CTTTCCTCAG GA ACT GTG GCT GCA CCA TCT GTC TTC ATC TTC 1625
                        Thr Val Ala Ala Pro Ser Val Phe Ile Phe
                        1 5 10

CCG CCA TCT GAT GAG CAG TTG AAA TCT GGA ACT GCC TCT GTT GTG TGC 1673
Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys
15 20 25

CTG CTG AAT AAC TTC TAT CCC AGA GAG GCC AAA GTA CAG TGG AAG GTG 1721
Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val Gln Trp Lys Val
30 35 40

GAT AAC GCC CTC CAA TCG GGT AAC TCC CAG GAG AGT GTC ACA GAG CAG 1769
Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln
45 50 55
GAC AGC AAG GAC AGC ACC TAC AGC CTC AGC AGC ACC CTG ACG CTG AG-
Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser
60 65 70

5 AAA GCA GAC TAC GAG AAA CAC AAA GTC TAC GCC TGC GAA GTC ACC CAT 
Lys Ala Asp Tyr Glu Lys His Lys Val Tyr Ala Cys Glu Val Thr His 
75 80 85 90 

CAG GGC CTG AGC TCG CCC GTC ACA AAG AGC TTC AAC AGG GGA GAG TG~ 
10 Gin Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys 

95 100 105 

TAGAATTCAG CTTTTAAAAC AGCTCTGGGG TTGTACCCAC CCCAGAGGCC CACGTG3CGC- 

15 CTAGTACTCC GGTATTGCGG TACCCTTGTA CGCCTGTTTT ATACTCCCTT CCCGTAACTT 

AGACGCACAA AACCAAGTTC AATAGAAGGG GGTACAAACC AGTACCACCA CGAACAAGCA 

^ CTTCTGTTTC CCCGGTGATG TCGTATAGAC TGCTTGCGTG GTTGAAAGCG ACGGATCC3T 

TATCCG CTT A TGTACTTCGA GAAGCCCAGT ACCACCTCGG AATCTTCGAT GCGTTGCGCT 

CAGCACTCAA CCCCAGAGTG TAGCTTAGGC TGATGAGTCT GGACATCCCT CACCGGTGAC 

25 GGTGGTCCAG GCTGCGTTGG CGGCCTACCT ATGG CTAACG CCATGGGACG CTAGTTGTGA 

ACAAGG TGTG AAGAGCCTAT TG AG CT AC AT AAGAATCCTC CGGCCCCTGA ATGCGGCTAA 

TCCCAACCTC GG AG CAGGTG GTCACAAACC AGTGATTGGC CTGTCGTAAC GCGCAAGTCC 

GTGGCGGAAC CGACTACTTT GGGTGTCCGT GTTTCCTTTT ATTTTATTGT GGCTGC77A7 

GGTGACAATC ACAGATTGTT ATCATAAAGC GAATTGGATT GCGGCCGCGA ATTAAGCTTr- 

CCGCCACC ATG GAC TGG ACC TGG CGC GTG TTT TGC CTG CTC GCC GTG G™ 
Met Asp Trp Thr Trp Arg Val Phe Cys Leu Leu Ala Va^ Ala 
1 5 10 " 



30 



35 



15 20 25 



30 



AAG AAA CCC GGT GCT TCC GTG AAG GTG AGC TGT AAA GCT AGC GGT TA- 

Lys Lys Pro Gly Ala Ser Val Lys Val Ser Cys Lys Ala Ser Gly T— 
45 35 40 4 J 

ACC TTC ACA TCC CAC TGG ATG CAT TGG GTT AGA CAG GCC CCA GGC CAA 

Thr Phe Thr Ser His Trp Met His Trp Val Arg Gin Ala Pro Gly G^n 

50 

GGG CTC GAG TGG ATT GGC GAG TTC AAC CCT TCA AAT GGC CGG ACA AAT 

Gly Leu Glu Trp lie Gly Glu Phe Asn Pro Ser Asn Gly Arg Thr Asn 
65 70 75 



1817 

1865 

1913 

1973 

2033 

2093 

2153 

2213 

2273 

2333 

2393 

2453 

2513 

2573 

2623 



CCT GGG GCC CAC AGC CAG GTG CAA CTA GTG CAG TCC GGC GCC GAA GTG 2 6 71 

40 Pro Gly Ala His Ser Gin Val Gin Leu Val Gin Ser Gly Ala Glu Val 



2719 



2767 



2815 
TAT AAC GAG AAG TTT AAG AGC AAG GCT ACC ATG ACC GTG GAC ACC TCT 2863
Tyr Asn Glu Lys Phe Lys Ser Lys Ala Thr Met Thr Val Asp Thr Ser
80 85 90

ACA AAC ACC GCC TAC ATG GAA CTG TCC AGC CTG CGC TCC GAG GAC ACT 2911
Thr Asn Thr Ala Tyr Met Glu Leu Ser Ser Leu Arg Ser Glu Asp Thr
95 100 105 110

GCA GTC TAC TAC TGC GCC TCA CGG GAT TAC GAT TAC GAT GGC AGA TAC 2959
Ala Val Tyr Tyr Cys Ala Ser Arg Asp Tyr Asp Tyr Asp Gly Arg Tyr
115 120 125

TTC GAC TAT TGG GGA CAG GGT ACC CTT GTC ACC GTC AGT TCA GGT GAG 3007
Phe Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser Gly Glu
130 135 140

TGG ATC CTC TGC GCC TGG GCC CAG CTC TGT CCC ACA CCG CGG TCA CAT 3055
Trp Ile Leu Cys Ala Trp Ala Gln Leu Cys Pro Thr Pro Arg Ser His
145 150 155

GGC ACC ACC TCT CTT GCA GCC TCC ACC AAG GGC CCA TCG GTC TTC CCC 3103
Gly Thr Thr Ser Leu Ala Ala Ser Thr Lys Gly Pro Ser Val Phe Pro
160 165 170

CTG GCA CCC TCC TCC AAG AGC ACC TCT GGG GGC ACA GCG GCC CTG GGC 3151
Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly
175 180 185 190

TGC CTG GTC AAG GAC TAC TTC CCC GAA CCG GTG ACG GTG TCG TGG AAC 3199
Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn
195 200 205

TCA GGC GCC CTG ACC AGC GGC GTG CAC ACC TTC CCG GCT GTC CTA CAG 3247
Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln
210 215 220

TCC TCA GGA CTC TAC TCC CTC AGC AGC GTG GTG ACC GTG CCC TCC AGC 3295
Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser
225 230 235

AGC TTG GGC ACC CAG ACC TAC ATC TGC AAC GTG AAT CAC AAG CCC AGC 3343
Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser
240 245 250

AAC ACC AAG GTG GAC AAG AAA GTT GAG CCC AAA TCT TGT GAC AAA ACT 3391
Asn Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Asp Lys Thr
255 260 265 270

CAC ACA TGC CCA CCG TGC CCA GCA CCT GAA CTC CTG GGG GGA CCG TCA 3439
His Thr Cys Pro Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly Pro Ser
275 280 285
GTC TTC CTC TTC CCC CCA AAA CCC AAG GAC ACC CTC ATG ATC TCC CG-
Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile Ser Arg
290 295 300

ACC CCT GAG GTC ACA TGC GTG GTG GTG GAC GTG AGC CAC GAA GAC C~ 
Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser His Glu Asp P^ 
305 310 P - 



20 



45 



AGC GTC CTC ACC GTC CTG CAC CAG GAC TGG CTG AAT GGC AAG GAG TA" 
Ser Val Leu Thr Val Leu His Gin Asp Trp Leu Asn Gly Lys Glu Tyr 



355 360 



36: 



425 



430 



AAT GGG CAG CCG GAG AAC AAC TAC AAG ACC ACG CCT CCC GTG CTG GAC 
Asn Gly Gin Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val Leu AsT 



n 3p 

4 0 435 44 ° 445 



TCC GAC GGC TCC TTC TTC CTC TAC AGC AAG CTC ACC GTG GAC AAG AGC 

Ser Asp Gly ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp Lys s~ 

450 455 460 

AGG TGG CAG CAG GGG AAC GTC TTC TCA TGC TCC GTG ATG CAT GAG G~ 

Arg Trp Gin Gin Gly Asn Val Phe ser Cys Ser Val Met His Glu Ala 
465 470 475 



3487 



3535 



10 rf S ^ ^ II GAC GGC GTG GAG GTG CAT AAT GCC 3583 

10 Glu Val Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His Asn Ala 
320 325 330 

AAG ACA AAG CCG CGG GAG GAG CAG TAC AAC AGC ACG TAC CGG GTG GT~ 

Lys Thr Lys Pro Arg Glu Glu Gin Tyr Asn Ser Thr Tyr Arg Val v a i 
13 335 340 345 350 



3631 



3679 



3727 



3775 



AAG TGC AAG GTC TCC AAC AAA GCC CTC CCA GCC CCC ATC GAG AAA A"" 
Lys Cys Lys Val Ser Asn Lys Ala Leu Pro Ala Pro He Glu Lys T-~ 
370 375 3ao 

25 ATC TCC AAA GCC AAA GGG CAG CCC CGA GAA CCA CAG GTG TAC ACC CTG 
He Ser Lys Ala Lys Gly Gin Pro Arg Glu Pro Gin Val Tvr Thr L°u 
385 390 395 " 

W CGG ^ GAG ACC CAG GTC AGC CTG ACC TGC 3823 

30 Pro Pro Ser Arg Asp Glu Leu Thr Lys Asn Gin Val Ser Leu Thr O s 
400 405 410 

CTG GTC AAA GGC TTC TAT CCC AGC GAC ATC GCC GTG GAG TGG GAG AG 
Leu Val Lys Gly Phe Tyr Pro Ser Asp He Ala Val Glu Trp Glu s- 
415 420 



3871 



3919 



3967 



4015 



CTG CAC AAC CAC TAC ACG CAG AAG AGC CTC TCC CTG TCT CCG GGT AAA 4 063 

50 Leu Has Asn Has Tyr Thr Gin Lys Ser Leu Ser Leu Ser Pro Gly Lvs 
480 485 4go 
ATG GTC AGA TCA TCT TCG CGA ACC CCG AGT GAC AAG CCT GTA GCC CAT 4111
Met Val Arg Ser Ser Ser Arg Thr Pro Ser Asp Lys Pro Val Ala His
495 500 505 510

GTT GTA GCA AAC CCT CAA GCT GAG GGG CAA CTG CAG TGG CTG AAC CGC 4159
Val Val Ala Asn Pro Gln Ala Glu Gly Gln Leu Gln Trp Leu Asn Arg
515 520 525

CGG GCC AAT GCC CTC CTG GCC AAT GGC GTC GAG CTG AGA GAT AAC CAG 4207
Arg Ala Asn Ala Leu Leu Ala Asn Gly Val Glu Leu Arg Asp Asn Gln
530 535 540

CTG GTG GTG CCA TCA GAG GGC CTG TAC CTC ATC TAC TCC CAG GTC CTC 4255
Leu Val Val Pro Ser Glu Gly Leu Tyr Leu Ile Tyr Ser Gln Val Leu
545 550 555

TTC AAG GGC CAA GGC TGC CCG TCG ACC CAT GTG CTC CTC ACC CAC ACC 4303
Phe Lys Gly Gln Gly Cys Pro Ser Thr His Val Leu Leu Thr His Thr
560 565 570

ATC AGC CGC ATC GCC GTC TCC TAC CAG ACC AAG GTT AAC CTC CTC TCT 4351
Ile Ser Arg Ile Ala Val Ser Tyr Gln Thr Lys Val Asn Leu Leu Ser
575 580 585 590

GCC ATC AAG AGC CCC TGC CAG AGG GAG ACC CCA GAG GGG GCT GAG GCC 4399
Ala Ile Lys Ser Pro Cys Gln Arg Glu Thr Pro Glu Gly Ala Glu Ala
595 600 605

AAG CCC TGG TAT GAG CCC ATC TAT CTG GGA GGG GTC TTC CAG CTC GAG 4447
Lys Pro Trp Tyr Glu Pro Ile Tyr Leu Gly Gly Val Phe Gln Leu Glu
610 615 620

AAG GGT GAC CGA CTC AGC GCT GAG ATC AAT CGG CCC GAC TAT CTC GAC 4495
Lys Gly Asp Arg Leu Ser Ala Glu Ile Asn Arg Pro Asp Tyr Leu Asp
625 630 635

TTT GCC GAG TCC GGA CAG GTC TAC TTT GGG ATC ATT GCC CTG 4537
Phe Ala Glu Ser Gly Gln Val Tyr Phe Gly Ile Ile Ala Leu
640 645 650

TGATAAGGAT CCCCGGGTAC CGAGCTCGAA TTCAGCTTTT AAAACAGCTC TGGGGTTGTA 4597

CCCACCCCAG AGGCCCACGT GGCGGCTAGT ACTCCGGTAT TGCGGTACCC TTGTACGCCT 4657

GTTTTATACT CCCTTCCCGT AACTTAGACG CACAAAACCA AGTTCAATAG AAGGGGGTAC 4717

AAACCAGTAC CACCACGAAC AAGCACTTCT GTTTCCCCGG TGATGTCGTA TAGACTGCTT 4777

GCGTGGTTGA AAGCGACGGA TCCGTTATCC GCTTATGTAC TTCGAGAAGC CCAGTACCAC 4837

CTCGGAATCT TCGATGCGTT GCGCTCAGCA CTCAACCCCA GAGTGTAGCT TAGGCTGATG 4897

AGTCTGGACA TCCCTCACCG GTGACGGTGG TCCAGGCTGC GTTGGCGGCC TACCTATGGC 4957
TAACGCCATG GGACGCTAGT TGTGAACAAG GTGTGAAGAG CCTATTGAGC TACATAAGAA 5017

TCCTCCGGCC CCTGAATGCG GCTAATCCCA ACCTCGGAGC AGGTGGTCAC AAACCAGTGA 5077

TTGGCCTGTC GTAACGCGCA AGTCCGTGGC GGAACCGACT ACTTTGGGTG TCCGTGTTTC 

CTTTTATTTT ATTGTGGCTG CTTATGGTGA CAATCACAGA TTGTTATCAT AAAGCGAATT 

10 GGATTGCGGC CGGCCGCCAC GACCGGTGCC GCCACCATCC CCTGACCCAC GCCCCTGACC 

CCTCACAAGG AGACGACCTT CC ATG ACC GAG TAC AAG CCC ACG GTG CGC CTC 

Met Thr Glu Tyr Lys Pro Thr Val Arg Leu 
15 1 5 10 



GCC ACC CGC GAC GAC GTC CCC CGG GCC GTA CGC ACC CTC GCC GCC GCG 
Ala Thr Arg Asp Asp Val Pro Arg Ala Val Arg Thr Leu Ala Ala Ala 

15 20 



25 



30 



CTC GAC ATC GGC AAG GTG TGG GTC GCG GAC GAC GGC GCC GCG GTG GCG 
Leu Asp lie Gly Lys Val Tr P Val Ala Asp As P Gly Ala tit tit 



^ S 5 52 £ S S £ SS E S 5 S SJ S S SI 

35 s s s s s s s: ™ = - ™ S 2 s « 2 



100 



105 



40 



CAA CAG ATG GAA GGC CTC CTG GCr rrr n*r* 

«. Gl „ „ t ox My Leu ™ s s £ ss ^ e s: m s 



115 



120 



« 5 S S E S SS SS S 2 S £ SS 2 £ E £ 

130 135 
CTG GGC AGC GCC GTC GTG CTC CCC GGA GTG GAG GCG GCC GAG CGC GCC 

50 Leu S; ser Ala val val ^ pro Gly val Giu ai * ^ S 9 C S 

145 150 



5137 
5197 
5257 
5309 



S357 



5405 



20 21 til T C ^ CGC CAC ACC GTC CCG GAC CGC CAC 

Phe Ala Asp Tyr Pro Ala Thr Arg His Thr Val Asp Pro Asp Arg £s 

30 35 40 

ATC GAG CGG GTC ACC GAG CTG CAA GAA CTC TTC CTC ACr rr-~ ^ „„ 

25 n. olu ^ Val iar Glu Leu G1 „ Glu ^ ™ C£ CTC OGC 

50 55 



5501 



5549 



5597 



5645 



5693 



5741 
GGG GTG CCC GCC TTC CTG GAG ACC TCC GCG CCC CGC AAC CTC CCC TTC 5789
Gly Val Pro Ala Phe Leu Glu Thr Ser Ala Pro Arg Asn Leu Pro Phe
155 160 165 170

TAC GAG CGG CTC GGC TTC ACC GTC ACC GCC GAC GTC GAG TGC CCG AAG 5837
Tyr Glu Arg Leu Gly Phe Thr Val Thr Ala Asp Val Glu Cys Pro Lys
175 180 185

GAC CGC GCG ACC TGG TGC ATG ACC CGC AAG CCC GGT GCC TGACGCCCGC 5886
Asp Arg Ala Thr Trp Cys Met Thr Arg Lys Pro Gly Ala
190 195

CCCACGACCC GCAGCGCCCG ACCGAAAGGA GCGCACGACC CCATGAGCTT CGATCCAGAC 5946

ATGATAAGAT ACATTGATGA GTTTGGACAA ACCACAACTA GAATGCAGTG AAAAAAATGC 6006

TTTATTTGTG AAATTTGTGA TGCTATTGCT TTATTTGTAA CCATTATAAG CTGCAATAAA 6066

CAAGTTAACA ACAACAATTG CATTCATTTT ATGTTTCAGG TTCAGGGGGA GGTGTGGGAG 6126

GTTTTTTAAA GCAAGTAAAA CCTCTACAAA TGTGGTATGG CTGATTATGA TCCTGCCTCG 6186

CGCGTTTCGG TGATGACGGT GAAAACCTCT GACACATGCA GCTCCCGGAG ACGGTCACAG 6246

CTTGTCTGTA AGCGGATGCC GGGAGCAGAC AAGCCCGTCA GGGCGCGTCA GCGGGTGTTG 6306

GCGGGTGTCG GGGCGCAGCC ATGACCCAGT CACGTAGCGA TAGCGGAGTG TATACTGGCT 6366

TAACTATGCG GCATCAGAGC AGATTGTACT GAGAGTGCAC CATATGTCGG GCCGCGTTGC 6426

TGGCGTTTTT CCATAGGCTC CGCCCCCCTG ACGAGCATCA CAAAAATCGA CGCTCAAGTC 6486

AGAGGTGGCG AAACCCGACA GGACTATAAA GATACCAGGC GTTTCCCCCT GGAAGCTCCC 6546

TCGTGCGCTC TCCTGTTCCG ACCCTGCCGC TTACCGGATA CCTGTCCGCC TTTCTCCCTT 6606

CGGGAAGCGT GGCGCTTTCT CATAGCTCAC GCTGTAGGTA TCTCAGTTCG GTGTAGGTCG 6666

TTCGCTCCAA GCTGGGCTGT GTGCACGAAC CCCCCGTTCA GCCCGACCGC TGCGCCTTAT 6726

CCGGTAACTA TCGTCTTGAG TCCAACCCGG TAAGACACGA CTTATCGCCA CTGGCAGCAG 6786

CCACTGGTAA CAGGATTAGC AGAGCGAGGT ATGTAGGCGG TGCTACAGAG TTCTTGAAGT 6846

GGTGGCCTAA CTACGGCTAC ACTAGAAGGA CAGTATTTGG TATCTGCGCT CTGCTGAAGC 6906

CAGTTACCTT CGGAAAAAGA GTTGGTAGCT CTTGATCCGG CAAACAAACC ACCGCTGGTA 6966

GCGGTGGTTT TTTTGTTTGC AAGCAGCAGA TTACGCGCAG AAAAAAAGGA TCTCAAGAAG 7026

ATCCTTTGAT CTTTTCTACG GGGTCTGACG CTCAGTGGAA CGAAAACTCA CGTTAAGGGA 7086

TTTTGGTCAT GAGATTATCA AAAAGGATCT TCACCTAGAT CCTTTTAAAT TAAAAATGAA 7146
GTTTTAAATC AATCTAAAGT ATATATGAGT AAACTTGGTC TGACAGTTAC CAATGCTTAA 7206

TCAGTGAGGC ACCTATCTCA GCGATCTGTC TATTTCGTTC ATCCATAGTT GCCTGACTCC 7266

CCGTCGTGTA GATAACTACG ATACGGGAGG GCTTACCATC TGGCCCCAGT GCTGCAATGA 7326

TACCGCGAGA CCCACGCTCA CCGGCTCCAG ATTTATCAGC AATAAACCAG CCAGCCGGAA 7386

GGGCCGAGCG CAGAAGTGGT CCTGCAACTT TATCCGCCTC CATCCAGTCT ATTAATTGTT 7446

GCCGGGAAGC TAGAGTAAGT AGTTCGCCAG TTAATAGTGC GCAACGTTGT TGCCATTGCT 7506

ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT CATTCAGCTC CGGTTCCCAA 7566

CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA AAGCGGTTAG CTCCTTCGGT 7626

CCTCCGATCG TTGTCAGAAG TAAGTTGGCC GCAGTGTTAT CACTCATGGT TATGGCAGCA 7686

CTGCATAATT CTCTTACTGT CATGCCATCC GTAAGATGCT TTTCTGTGAC TGGTGAGTAC 7746

TCAACCAAGT CATTCTGAGA ATAGTGTATG CGGCGACCGA GTTGCTCTTG CCCGGCGTCA 7806

ACACGGGATA ATACCGCGCC ACATAGCAGA ACTTTAAAAG TGCTCATCAT TGGAAAACGT 7866

TCTTCGGGGC GAAAACTCTC AAGGATCTTA CCGCTGTTGA GATCCAGTTC GATGTAACCC 7926

ACTCGTGCAC CCAACTGATC TTCAGCATCT TTTACTTTCA CCAGCGTTTC TGGGTGAGCA 7986

AAAACAGGAA GGCAAAATGC CGCAAAAAAG GGAATAAGGG CGACACGGAA ATGTTGAATA 8046

CTCATACTCT TCCTTTTTCA ATATTATTGA AGCATTTATC AGGGTTATTG TCTCATGAGC 8106

GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG CACATTTCCC 8166

CGAAAAGTGC CACCTGACGT CTAAGAAACC ATTATTATCA TGACATTAAC CTATAAAAAT 8226

AGGCGTATCA CGAGGCCCTT TCGTCTTCAA GAATTGGTCG ATCGACCAAT TCTCATGTTT 8286

GACAGCTTAT CA 8298



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 14 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala
1 5 10
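
The relationship between the leader CDS (positions 977..1018 of SEQ ID NO: 1) and SEQ ID NO: 2 can be illustrated by translating the codons with the standard genetic code. The following Python sketch is an illustration only and is not part of the original listing. The codon string is an assumption insofar as its last three codons are read from the figure copy of the sequence, since they are illegible in the listing itself; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative.

```python
# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter codes; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> list:
    """Translate a DNA string (whitespace ignored) into three-letter codes."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]]
            for i in range(0, len(dna), 3)]


# ASSUMPTION: codons of positions 977..1018; the last three codons
# (GCA ACA GCT) are taken from the figure copy of the sequence.
LEADER_CDS = "ATG GGA TGG AGC TGT ATC ATC CTC TTC TTG GTA GCA ACA GCT"

if __name__ == "__main__":
    print(" ".join(translate(LEADER_CDS)))
```

With the codons as assumed above, the translation reproduces the 14 amino acids of SEQ ID NO: 2.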
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 109 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Val His Ser Asp Ile Gln Met Thr Gln Ser Pro Ser Ser Leu Ser Ala
1 5 10 15

Ser Val Gly Asp Arg Val Thr Ile Thr Cys Ser Ala Ser Ser Ser Val
20 25 30

Thr Tyr Met Tyr Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu
35 40 45

Leu Ile Tyr Asp Thr Ser Asn Leu Ala Ser Gly Val Pro Ser Arg Phe
50 55 60

Ser Gly Ser Gly Ser Gly Thr Asp Tyr Thr Phe Thr Ile Ser Ser Leu
65 70 75 80

Gln Pro Glu Asp Ile Ala Thr Tyr Tyr Cys Gln Gln Trp Ser Ser His
85 90 95

Ile Phe Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys
100 105

35 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 106 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln
1 5 10 15

Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr
20 25 30

Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser
35 40 45

Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr
50 55 60

Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys
65 70 75 80

His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro
85 90 95

Val Thr Lys Ser Phe Asn Arg Gly Glu Cys
100 105



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 652 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Asp Trp Thr Trp Arg Val Phe Cys Leu Leu Ala Val Ala Pro Gly
1 5 10 15

Ala His Ser Gln Val Gln Leu Val Gln Ser Gly Ala Glu Val Lys Lys
20 25 30

Pro Gly Ala Ser Val Lys Val Ser Cys Lys Ala Ser Gly Tyr Thr Phe
35 40 45

Thr Ser His Trp Met His Trp Val Arg Gln Ala Pro Gly Gln Gly Leu
50 55 60

Glu Trp Ile Gly Glu Phe Asn Pro Ser Asn Gly Arg Thr Asn Tyr Asn
65 70 75 80

Glu Lys Phe Lys Ser Lys Ala Thr Met Thr Val Asp Thr Ser Thr Asn
85 90 95

Thr Ala Tyr Met Glu Leu Ser Ser Leu Arg Ser Glu Asp Thr Ala Val
100 105 110

Tyr Tyr Cys Ala Ser Arg Asp Tyr Asp Tyr Asp Gly Arg Tyr Phe Asp
115 120 125

Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser Gly Glu Trp Ile
130 135 140

Leu Cys Ala Trp Ala Gln Leu Cys Pro Thr Pro Arg Ser His Gly Thr
145 150 155 160

Thr Ser Leu Ala Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala
165 170 175

Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu
180 185 190

Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly
195 200 205

Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser
210 215 220

Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu
225 230 235 240

Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr
245 250 255

Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Asp Lys Thr His Thr
260 265 270

Cys Pro Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly Pro Ser Val Phe
275 280 285

Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile Ser Arg Thr Pro
290 295 300

Glu Val Thr Cys Val Val Val Asp Val Ser His Glu Asp Pro Glu Val
305 310 315 320

Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His Asn Ala Lys Thr
325 330 335

Lys Pro Arg Glu Glu Gln Tyr Asn Ser Thr Tyr Arg Val Val Ser Val
340 345 350

Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys Glu Tyr Lys Cys
355 360 365

Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu Lys Thr Ile Ser
370 375 380

Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr Thr Leu Pro Pro
385 390 395 400

Ser Arg Asp Glu Leu Thr Lys Asn Gln Val Ser Leu Thr Cys Leu Val
405 410 415

Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp Glu Ser Asn Gly
420 425 430

Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val Leu Asp Ser Asp
435 440 445

Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp Lys Ser Arg Trp
450 455 460

Gln Gln Gly Asn Val Phe Ser Cys Ser Val Met His Glu Ala Leu His
465 470 475 480

Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Pro Gly Lys Met Val
485 490 495

Arg Ser Ser Ser Arg Thr Pro Ser Asp Lys Pro Val Ala His Val Val
500 505 510

Ala Asn Pro Gln Ala Glu Gly Gln Leu Gln Trp Leu Asn Arg Arg Ala
515 520 525

Asn Ala Leu Leu Ala Asn Gly Val Glu Leu Arg Asp Asn Gln Leu Val
530 535 540

Val Pro Ser Glu Gly Leu Tyr Leu Ile Tyr Ser Gln Val Leu Phe Lys
545 550 555 560

Gly Gln Gly Cys Pro Ser Thr His Val Leu Leu Thr His Thr Ile Ser
565 570 575

Arg Ile Ala Val Ser Tyr Gln Thr Lys Val Asn Leu Leu Ser Ala Ile
580 585 590

Lys Ser Pro Cys Gln Arg Glu Thr Pro Glu Gly Ala Glu Ala Lys Pro
595 600 605

Trp Tyr Glu Pro Ile Tyr Leu Gly Gly Val Phe Gln Leu Glu Lys Gly
610 615 620

Asp Arg Leu Ser Ala Glu Ile Asn Arg Pro Asp Tyr Leu Asp Phe Ala
625 630 635 640

Glu Ser Gly Gln Val Tyr Phe Gly Ile Ile Ala Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 199 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Thr Glu Tyr Lys Pro Thr Val Arg Leu Ala Thr Arg Asp Asp Val
1 5 10 15

Pro Arg Ala Val Arg Thr Leu Ala Ala Ala Phe Ala Asp Tyr Pro Ala
20 25 30

Thr Arg His Thr Val Asp Pro Asp Arg His Ile Glu Arg Val Thr Glu
35 40 45

Leu Gln Glu Leu Phe Leu Thr Arg Val Gly Leu Asp Ile Gly Lys Val
50 55 60

Trp Val Ala Asp Asp Gly Ala Ala Val Ala Val Trp Thr Thr Pro Glu
65 70 75 80

Ser Val Glu Ala Gly Ala Val Phe Ala Glu Ile Gly Pro Arg Met Ala
85 90 95

Glu Leu Ser Gly Ser Arg Leu Ala Ala Gln Gln Gln Met Glu Gly Leu
100 105 110

Leu Ala Pro His Arg Pro Lys Glu Pro Ala Trp Phe Leu Ala Thr Val
115 120 125

Gly Val Ser Pro Asp His Gln Gly Lys Gly Leu Gly Ser Ala Val Val
130 135 140

Leu Pro Gly Val Glu Ala Ala Glu Arg Ala Gly Val Pro Ala Phe Leu
145 150 155 160

Glu Thr Ser Ala Pro Arg Asn Leu Pro Phe Tyr Glu Arg Leu Gly Phe
165 170 175

Thr Val Thr Ala Asp Val Glu Cys Pro Lys Asp Arg Ala Thr Trp Cys
180 185 190

Met Thr Arg Lys Pro Gly Ala
195

Patent Claims 

1. Oligocistronic expression vector suitable for the production of a heteromeric
protein consisting of at least two protein chains in a mammalian host cell 
comprising 

(i) a promoter / enhancer sequence, 

(ii) a sequence encoding a first chain of the heteromeric protein or a 
fragment thereof, 

(iii) a sequence encoding a second chain of the heteromeric protein or a 
fragment thereof, 

(iv) optionally a sequence encoding a third or further chain of the 
heteromeric protein or a fragment thereof, 

(v) a sequence encoding a selection marker, and 

(vi) at least two sequences comprising a 5'-UTR poliovirus sequence 
containing an IRES element. 



2. Expression vector according to claim 1, wherein the sequences (i) to (vi) are in
the following order from upstream to downstream progression of said vector
construct:

(1) a sequence comprising the promoter / enhancer sequence (i),

(2) a sequence comprising the sequence encoding a first chain of the
heteromeric protein or a fragment thereof (ii),

(3) a sequence (vi) comprising a first IRES element,

(4) a sequence comprising the sequence encoding a second chain of the
heteromeric protein or a fragment thereof (iii),

(5) a sequence (vi) comprising a second IRES element,

(6) optionally a sequence comprising the sequence encoding a third or
further chain of the heteromeric protein or a fragment thereof (iv), and a
sequence comprising a third or further IRES element (vi) located behind
the third or further sequence encoding the corresponding chain,

(7) a sequence comprising the selection marker (v).

3. Tricistronic expression vector according to claim 1 or 2 (comprising two IRES
elements) wherein the sequence (ii) encodes the light chain and the sequence
(iii) comprises a sequence encoding the heavy chain of a monoclonal antibody
(iiia), and sequences (iv) are not present.

4. Tricistronic expression vector according to claim 3, wherein the sequence (iii)
comprises besides sequence (iiia) a sequence (iiib) encoding a biologically
active ligand in order to produce an antibody fusion protein.

5. Expression vector according to claims 3 to 4 wherein the sequence (iiia) is
shortened at its C-terminus and the sequence (iiib) is shortened at its N-terminus
by a number of nucleotides each coding for 1 to 20 amino acids.

6. Expression vector according to claims 3 to 5, wherein a sequence (iiib) is used
encoding a cytokine or chemokine.

7. Expression vector according to claim 6, wherein a sequence (iiib) is used
encoding TNF alpha or IL-2.

8. Expression vector according to claims 1 to 7, wherein sequences (ii) and (iii)
encoding the light and heavy chain of a monoclonal anti-EGFR antibody are
used.

9. Expression vector according to claim 8 comprising the sequences encoding
humanized monoclonal antibody 425 (mAb425).
10. Expression vector according to claim 3 comprising the CMV/MPSV
promoter/enhancer sequence followed by the sequence encoding the mAb425
light chain, followed by the sequence from 5' UTR poliovirus containing an
IRES element, followed by a fusion gene encoding a fusion protein consisting
of the heavy chain of humanized mAb425 and, fused at its C-terminus, the
sequence encoding TNF alpha or IL-2, followed by another IRES element from
5' UTR poliovirus, followed by a sequence coding for puromycin acetyl
transferase and, finally, the sequence of the polyadenylation signal of SV40.

11. Expression vector according to claim 10 comprising the DNA sequence which
codes for the amino acid sequence depicted in Fig. 15.

12. Expression vector according to claims 1 to 10, comprising, additionally, two
SAR elements.

13. Expression system comprising a mammalian host cell transformed with an
expression vector specified in one of the claims 1 to 12.

14. Expression system according to claim 13, wherein the host cell is CHO, BHK-21
or SP2/0.

15. Process for the production of a heteromeric protein or fragments thereof by
cultivating the host cells of an expression system specified in claim 13 in a
suitable nutrient and separating the complete and active heteromeric protein
from the cells and / or the medium.

16. Process according to claim 15 for the production of mAb425/TNF-alpha or
mAb425/IL-2 antibody fusion proteins or fragments thereof.
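
The element order recited in the claims can be pictured as an ordered list in which each coding sequence after the first is directly preceded by an IRES element. The following Python sketch is an illustration only and is not part of the claims. It models the order recited for the TNF alpha variant; the `Element` class, the `kind` labels and the function names are hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str   # one of: "promoter", "cds", "ires", "polyA" (illustrative labels)
    label: str


# Element order as recited for the tricistronic vector (TNF alpha variant).
# The labels are shorthand for this sketch, not terms defined by the claims.
CASSETTE = [
    Element("promoter", "CMV/MPSV promoter/enhancer"),
    Element("cds", "mAb425 light chain"),
    Element("ires", "5' UTR poliovirus IRES"),
    Element("cds", "mAb425 heavy chain fused to TNF alpha"),
    Element("ires", "5' UTR poliovirus IRES"),
    Element("cds", "puromycin acetyl transferase"),
    Element("polyA", "SV40 polyadenylation signal"),
]


def cistron_count(elements) -> int:
    """Number of coding sequences in the cassette."""
    return sum(1 for e in elements if e.kind == "cds")


def is_ires_linked(elements) -> bool:
    """True if the cassette starts with a promoter and every coding sequence
    after the first is directly preceded by an IRES element."""
    if not elements or elements[0].kind != "promoter":
        return False
    seen_cds = False
    previous = None
    for element in elements:
        if element.kind == "cds":
            if seen_cds and (previous is None or previous.kind != "ires"):
                return False
            seen_cds = True
        previous = element
    return seen_cds


if __name__ == "__main__":
    print(cistron_count(CASSETTE), is_ires_linked(CASSETTE))
```

In this sketch the selection marker is treated as the third cistron, so the cassette counts three coding sequences joined by two IRES elements, in line with the term "tricistronic".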

FIG. 1A
FIG. 1B
FIG. 1C
FIG. 1D
FIG. 1E
FIG. 1F
FIG. 1G
FIG. 3
FIG. 4
FIG. 5
FIG. 6
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TCGATAATGA AAGACCCCAC CTGTAGGTTT GGCAAGCTAG CTTAAGTAAC GCCATTTTGC 
AAGGCATGGG AAAAATACAT AACTGAGAAT AGAGAAGTTC AGATCAAGGT CAGGAACAGA 120 
GA A ACAGG AG A ATATGGGCC AA ACAGGATA TCTGTGGTAA GCAGTTCCTG CCCCGCTCAG 1 80 
GGCCAAGAAC AGTTGGAACA GGAGAATTGG GCCAAACAGG ATATCTGTGG TAAGCAGTTC 240 
CTGCCCCGCT CAGGGCCAAG AACAGATGGT CCCCAGATGC GGTCCCGCCC TCAGCAGTTT 300 
CTAGACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT 360 
G A CGTC A A T A ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA 420 
ATGGGTGGAG TATTTACGGT AAACTGCCCA CTTGGCAGTA CATCA AGTGT ATCATATGCC 480 
AAGTACGCCC CCTATTGACG TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA 540 
CATGACCTTA TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG 660 
ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAA ATCAACG 720 
GGACTTTCCA A A ATGTCGTA ACAACTCCGC CCCATTGACG CAAATGGGCG GTAGGCGTGT 780 
ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG 840 
CCATCCACGC TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCGAGGAACT 900 
GGAAAACCAG AAAGTTAACT GGTAAGTTTA GTCTTTTTGT CTTTTATTTC AGGTCCCGGA 960 

ATTAAGCTTC GCCACC ATG GGA TGG AGC TGT ATC ATC CTC TTC TTG GTA 1009 

Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val 

GCA ACA GCT AC AGGTAAGGGG CTCACAGTAG CAGGCTTGAG GTCTGGACAT 1060 
Ala Thr Ala 

ATATATGGGT GACAATGACA TCCACTTTGC CTTTCTCTCC ACAGGT GTC CAC TCC 1115 

Val His Ser 

GAC ATC CAG ATG ACC CAG AGC CCA AGC AGC CTG AGC GCC AGC GTG GGT 1163 
Asp Ile Gln Met Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly 

GAC AGA GTG ACC ATC ACC TGT AGT GCC AGC TCA ACT GTA ACT TAC ATG 1211 
Asp Arg Val Thr Ile Thr Cys Ser Ala Ser Ser Ser Val Thr Tyr Met 

TAT TGG TAC CAG CAG AAG CCA GGT AAG GCT CCA AAG CTG CTG ATC TAC 1259 
Tyr Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr 

GAC ACA TCC AAC CTG GCT TCT GGT GTG CCA AGC AGA TTC AGC GGT AGC 1307 
Asp Thr Ser Asn Leu Ala Ser Gly Val Pro Ser Arg Phe Ser Gly Ser 

GGT AGC GGT ACC GAC TAC ACC TTC ACC ATC AGC AGC CTC CAG CCA GAG 1355 
Gly Ser Gly Thr Asp Tyr Thr Phe Thr Ile Ser Ser Leu Gln Pro Glu 

GAC ATC GCC ACC TAC TAC TGC CAG CAG TGG ACT ACT CAC ATA TTC ACG 1403 
Asp Ile Ala Thr Tyr Tyr Cys Gln Gln Trp Ser Ser His Ile Phe Thr 

TTC GGC CAA GGG ACC AAG GTG GAA ATC AAA CGTGAGTAGA ATTTAAACTT 1453 
Phe Gly Gln Gly Thr Lys Val Glu Ile Lys 

TGCTTCCTCA GTTGGATCCA TCTGGGATAA GCATGCTGTT TTCTGTCTGT CCCTAACATG 1513 

CCCTGTGATT ATGCGCAAAC AACACACCCA AGGGCAGAAC TTTGTTACTT AAACACCATC 1573 

CTGTTTGCTT CTTTCCTCAG GA ACT GTG GCT GCA CCA TCT GTC TTC ATC TTC 1625 

Thr Val Ala Ala Pro Ser Val Phe Ile Phe 

CCG CCA TCT GAT GAG CAG TTG AAA TCT GGA ACT GCC TCT GTT GTG TGC 1673 
Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys 

CTG CTG AAT AAC TTC TAT CCC AGA GAG GCC AAA GTA CAG TGG AAG GTG 1721 
Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val Gln Trp Lys Val 

GAT AAC GCC CTC CAA TCG GGT AAC TCC CAG GAG AGT GTC ACA GAG CAG 1769 
Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln 

GAC AGC AAG GAC AGC ACC TAC AGC CTC AGC AGC ACC CTG ACG CTG AGC 1817 
Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser 

AAA GCA GAC TAC GAG AAA CAC AAA GTC TAC GCC TGC GAA GTC ACC CAT 1865 
Lys Ala Asp Tyr Glu Lys His Lys Val Tyr Ala Cys Glu Val Thr His 

CAG GGC CTG AGC TCG CCC GTC ACA AAG AGC TTC AAC AGG GGA GAG TGT 1913 
Gln Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys 

TAGAATTCAGCTT TTAAAACAGC TCTGGGGTTG TACCCACCCC AGAGGCCCAC 1966 

GTGGCGGCTA GTACTCCGGT ATTGCGGTAC CCTTGTACGC CTGTTTTATA CTCCCTTCCC 2026 

GTAACTTAGA CGCACAAAAC CAAGTTCAAT AGAAGGGGGT ACAAACCAGT ACCACCACGA 2086 

ACAAGCACTT CTGTTTCCCC GGTGATGTCG TATAGACTGC TTGCGTGGTT GAAAGCGACG 2146 

GATCCGTTAT CCGCTTATGT ACTTCGAGAA GCCCAGTACC ACCTCGGAAT CTTCGATGCG 2206 

TTGCGCTCAG CACTCAACCC CAGAGTGTAG CTTAGGCTGA TGAGTCTGGA CATCCCTCAC 2266 

CGGTGACGGT GGTCCAGGCT GCGTTGGCGG CCTACCTATG GCTAACGCCA TGGGACGCTA 2326 

GTTGTGAACA AGGTGTGAAG AGCCTATTGA GCTACATAAG AATCCTCCGG CCCCTGAATG 2386 
CGGCTAATCC CAACCTCGGA GCAGGTGGTC ACAAACCAGT GATTGGCCTG TCGTAACGCG 2446 
CAAGTCCGTG GCGGAACCGA CTACTTTGGG TGTCCGTGTT TCCTTTTATT TTATTGTGGC 2506 
TGCTTATGGT GACAATCACA GATTGTTATC ATAAAGCGAA TTGGATTGCG GCCGCGAATT 2566 



AAGCTTGCCG CCACC ATG GAC TGG ACC TGG CGC GTG TTT TGC CTG CTC GCC 2617 

Met Asp Trp Thr Trp Arg Val Phe Cys Leu Leu Ala 

GTG GCT CCT GGG GCC CAC AGC CAG GTG CAA CTA GTG CAG TCC GGC GCC 2665 
Val Ala Pro Gly Ala His Ser Gln Val Gln Leu Val Gln Ser Gly Ala 

GAA GTG AAG AAA CCC GGT GCT TCC GTG AAG GTG AGC TGT AAA GCT AGC 2713 
Glu Val Lys Lys Pro Gly Ala Ser Val Lys Val Ser Cys Lys Ala Ser 

GGT TAT ACC TTC ACA TCC CAC TGG ATG CAT TGG GTT AGA CAG GCC CCA 2761 
Gly Tyr Thr Phe Thr Ser His Trp Met His Trp Val Arg Gln Ala Pro 

GGC CAA GGG CTC GAG TGG ATT GGC GAG TTC AAC CCT TCA AAT GGC CGG 2809 
Gly Gln Gly Leu Glu Trp Ile Gly Glu Phe Asn Pro Ser Asn Gly Arg 

ACA AAT TAT AAC GAG AAG TTT AAG AGC AAG GCT ACC ATG ACC GTG GAC 2857 
Thr Asn Tyr Asn Glu Lys Phe Lys Ser Lys Ala Thr Met Thr Val Asp 

ACC TCT ACA AAC ACC GCC TAC ATG GAA CTG TCC AGC CTG CGC TCC GAG 2905 
Thr Ser Thr Asn Thr Ala Tyr Met Glu Leu Ser Ser Leu Arg Ser Glu 

GAC ACT GCA GTC TAC TAC TGC GCC TCA CGG GAT TAC GAT TAC GAT GGC 2953 
Asp Thr Ala Val Tyr Tyr Cys Ala Ser Arg Asp Tyr Asp Tyr Asp Gly 

AGA TAC TTC GAC TAT TGG GGA CAG GGT ACC CTT GTC ACC GTC AGT TCA 3001 
Arg Tyr Phe Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser 

GGT GAG TGG ATC CTC TGC GCC TGG GCC CAG CTC TGT CCC ACA CCG CGG 3049 
Gly Glu Trp Ile Leu Cys Ala Trp Ala Gln Leu Cys Pro Thr Pro Arg 

TCA CAT GGC ACC ACC TCT CTT GCA GCC TCC ACC AAG GGC CCA TCG GTC 3097 
Ser His Gly Thr Thr Ser Leu Ala Ala Ser Thr Lys Gly Pro Ser Val 

TTC CCC CTG GCA CCC TCC TCC AAG AGC ACC TCT GGG GGC ACA GCG GCC 3145 
Phe Pro Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala 

CTG GGC TGC CTG GTC AAG GAC TAC TTC CCC GAA CCG GTG ACG GTG TCG 3193 
Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser 

TGG AAC TCA GGC GCC CTG ACC AGC GGC GTG CAC ACC TTC CCG GCT GTC 3241 
Trp Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val 

CTA CAG TCC TCA GGA CTC TAC TCC CTC AGC AGC GTG GTG ACC GTG CCC 3289 
Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro 

TCC AGC AGC TTG GGC ACC CAG ACC TAC ATC TGC AAC GTG AAT CAC AAG 3337 
Ser Ser Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys 

CCC AGC AAC ACC AAG GTG GAC AAG AAA GTT GAG CCC AAA TCT TGT GAC 3385 
Pro Ser Asn Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Asp 

AAA ACT CAC ACA TGC CCA CCG TGC CCA GCA CCT GAA CTC CTG GGG GGA 3433 
Lys Thr His Thr Cys Pro Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly 

CCG TCA GTC TTC CTC TTC CCC CCA AAA CCC AAG GAC ACC CTC ATG ATC 3481 
Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile 

TCC CGG ACC CCT GAG GTC ACA TGC GTG GTG GTG GAC GTG AGC CAC GAA 3529 
Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser His Glu 

GAC CCT GAG GTC AAG TTC AAC TGG TAC GTG GAC GGC GTG GAG GTG CAT 3577 
Asp Pro Glu Val Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His 

AAT GCC AAG ACA AAG CCG CGG GAG GAG CAG TAC AAC AGC ACG TAC CGG 3625 
Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Tyr Asn Ser Thr Tyr Arg 

GTG GTC AGC GTC CTC ACC GTC CTG CAC CAG GAC TGG CTG AAT GGC AAG 3673 
Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys 

GAG TAC AAG TGC AAG GTC TCC AAC AAA GCC CTC CCA GCC CCC ATC GAG 3721 
Glu Tyr Lys Cys Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu 

AAA ACC ATC TCC AAA GCC AAA GGG CAG CCC CGA GAA CCA CAG GTG TAC 3769 
Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr 

ACC CTG CCC CCA TCC CGG GAT GAG CTG ACC AAG AAC CAG GTC AGC CTG 3817 
Thr Leu Pro Pro Ser Arg Asp Glu Leu Thr Lys Asn Gln Val Ser Leu 

ACC TGC CTG GTC AAA GGC TTC TAT CCC AGC GAC ATC GCC GTG GAG TGG 3865 
Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp 

GAG AGC AAT GGG CAG CCG GAG AAC AAC TAC AAG ACC ACG CCT CCC GTG 3913 
Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val 

CTG GAC TCC GAC GGC TCC TTC TTC CTC TAC AGC AAG CTC ACC GTG GAC 3961 
Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp 

AAG AGC AGG TGG CAG CAG GGG AAC GTC TTC TCA TGC TCC GTG ATG CAT 4009 
Lys Ser Arg Trp Gln Gln Gly Asn Val Phe Ser Cys Ser Val Met His 

GAG GCT CTG CAC AAC CAC TAC ACG CAG AAG AGC CTC TCC CTG TCT CCG 4057 
Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Pro 

GGT AAA ATG GTC AGA TCA TCT TCG CGA ACC CCG AGT GAC AAG CCT GTA 4105 
Gly Lys Met Val Arg Ser Ser Ser Arg Thr Pro Ser Asp Lys Pro Val 

GCC CAT GTT GTA GCA AAC CCT CAA GCT GAG GGG CAA CTG CAG TGG CTG 4153 
Ala His Val Val Ala Asn Pro Gln Ala Glu Gly Gln Leu Gln Trp Leu 

AAC CGC CGG GCC AAT GCC CTC CTG GCC AAT GGC GTC GAG CTG AGA GAT 4201 
Asn Arg Arg Ala Asn Ala Leu Leu Ala Asn Gly Val Glu Leu Arg Asp 

AAC CAG CTG GTG GTG CCA TCA GAG GGC CTG TAC CTC ATC TAC TCC CAG 4249 
Asn Gln Leu Val Val Pro Ser Glu Gly Leu Tyr Leu Ile Tyr Ser Gln 

GTC CTC TTC AAG GGC CAA GGC TGC CCG TCG ACC CAT GTG CTC CTC ACC 4297 
Val Leu Phe Lys Gly Gln Gly Cys Pro Ser Thr His Val Leu Leu Thr 

CAC ACC ATC AGC CGC ATC GCC GTC TCC TAC CAG ACC AAG GTT AAC CTC 4345 
His Thr Ile Ser Arg Ile Ala Val Ser Tyr Gln Thr Lys Val Asn Leu 

CTC TCT GCC ATC AAG AGC CCC TGC CAG AGG GAG ACC CCA GAG GGG GCT 4393 
Leu Ser Ala Ile Lys Ser Pro Cys Gln Arg Glu Thr Pro Glu Gly Ala 

GAG GCC AAG CCC TGG TAT GAG CCC ATC TAT CTG GGA GGG GTC TTC CAG 4441 
Glu Ala Lys Pro Trp Tyr Glu Pro Ile Tyr Leu Gly Gly Val Phe Gln 

CTC GAG AAG GGT GAC CGA CTC AGC GCT GAG ATC AAT CGG CCC GAC TAT 4489 
Leu Glu Lys Gly Asp Arg Leu Ser Ala Glu Ile Asn Arg Pro Asp Tyr 

CTC GAC TTT GCC GAG TCC GGA CAG GTC TAC TTT GGG ATC ATT GCC CTG 4537 
Leu Asp Phe Ala Glu Ser Gly Gln Val Tyr Phe Gly Ile Ile Ala Leu 

TGATAAGGATCCCCGG GTACCGAGCT CGAATTCAGC TTTTAAAACA GCTCTGGGGT 4593 
TGTACCCACC CCAGAGGCCC ACGTGGCGGC TAGTACTCCG GTATTGCGGT ACCCTTGTAC 4653 
GCCTGTTTTA TACTCCCTTC CCGTAACTTA GACGCACAAA ACCAAGTTCA ATAGAAGGGG 4713 
GTACAAACCA GTACCACCAC GAACAAGCAC TTCTGTTTCC CCGGTGATGT CGTATAGACT 4773 
GCTTGCGTGG TTGAAAGCGA CGGATCCGTT ATCCGCTTAT GTACTTCGAG AAGCCCAGTA 4833 
CCACCTCGGA ATCTTCGATG CGTTGCGCTC AGCACTCAAC CCCAGAGTGT AGCTTAGGCT 4893 
GATGAGTCTG GACATCCCTC ACCGGTGACG GTGGTCCAGG CTGCGTTGGC GGCCTACCTA 4953 
TGGCTAACGC CATGGGACGC TAGTTGTGAA CAAGGTGTGA AGAGCCTATT GAGCTACATA 5013 
AGAATCCTCC GGCCCCTGAA TGCGGCTAAT CCCAACCTCG GAGCAGGTGG TCACAAACCA 5073 
GTGATTGGCC TGTCGTAACG CGCAAGTCCG TGGCGGAACC GACTACTTTG GGTGTCCGTG 5133 
TTTCCTTTTA TTTTATTGTG GCTGCTTATG GTGACAATCA CAGATTGTTA TCATAAAGCG 5193 
AATTGGATTG CGGCCGGCCG CCACGACCGG TGCCGCCACC ATCCCCTGAC CCACGCCCCT 5253 
GACCCCTCAC AAGGAGACGA CCTTCC ATG ACC GAG TAC AAG CCC ACG GTG CGC 5306 

Met Thr Glu Tyr Lys Pro Thr Val Arg 

CTC GCC ACC CGC GAC GAC GTC CCC CGG GCC GTA CGC ACC CTC GCC GCC 5354 
Leu Ala Thr Arg Asp Asp Val Pro Arg Ala Val Arg Thr Leu Ala Ala 

GCG TTC GCC GAC TAC CCC GCC ACG CGC CAC ACC GTC GAC CCG GAC CGC 5402 
Ala Phe Ala Asp Tyr Pro Ala Thr Arg His Thr Val Asp Pro Asp Arg 

CAC ATC GAG CGG GTC ACC GAG CTG CAA GAA CTC TTC CTC ACG CGC GTC 5450 
His Ile Glu Arg Val Thr Glu Leu Gln Glu Leu Phe Leu Thr Arg Val 

GGG CTC GAC ATC GGC AAG GTG TGG GTC GCG GAC GAC GGC GCC GCG GTG 5498 
Gly Leu Asp Ile Gly Lys Val Trp Val Ala Asp Asp Gly Ala Ala Val 

GCG GTC TGG ACC ACG CCG GAG AGC GTC GAA GCG GGG GCG GTG TTC GCC 5546 
Ala Val Trp Thr Thr Pro Glu Ser Val Glu Ala Gly Ala Val Phe Ala 

GAG ATC GGC CCG CGC ATG GCC GAG TTG AGC GGT TCC CGG CTG GCC GCG 5594 
Glu Ile Gly Pro Arg Met Ala Glu Leu Ser Gly Ser Arg Leu Ala Ala 

CAG CAA CAG ATG GAA GGC CTC CTG GCG CCG CAC CGG CCC AAG GAG CCC 5642 
Gln Gln Gln Met Glu Gly Leu Leu Ala Pro His Arg Pro Lys Glu Pro 

GCG TGG TTC CTG GCC ACC GTC GGC GTC TCG CCC GAC CAC CAG GGC AAG 5690 
Ala Trp Phe Leu Ala Thr Val Gly Val Ser Pro Asp His Gln Gly Lys 

GGT CTG GGC AGC GCC GTC GTG CTC CCC GGA GTG GAG GCG GCC GAG CGC 5738 
Gly Leu Gly Ser Ala Val Val Leu Pro Gly Val Glu Ala Ala Glu Arg 

GCC GGG GTG CCC GCC TTC CTG GAG ACC TCC GCG CCC CGC AAC CTC CCC 5786 
Ala Gly Val Pro Ala Phe Leu Glu Thr Ser Ala Pro Arg Asn Leu Pro 

TTC TAC GAG CGG CTC GGC TTC ACC GTC ACC GCC GAC GTC GAG TGC CCG 5834 
Phe Tyr Glu Arg Leu Gly Phe Thr Val Thr Ala Asp Val Glu Cys Pro 

AAG GAC CGC GCG ACC TGG TGC ATG ACC CGC AAG CCC GGT GCC TGA 5879 
Lys Asp Arg Ala Thr Trp Cys Met Thr Arg Lys Pro Gly Ala 



CGCCCGCCCC ACGACCCGCA GCGCCCGACC GAAAGGAGCG CACGACCCCA TGAGCTTCGA 5939 
TCCAGACATG ATAAGATACA TTGATGAGTT TGGACAAACC ACAACTAGAA TGCAGTGAAA 5999 
AAAATGCTTT ATTTGTGAAA TTTGTGATGC TATTGCTTTA TTTGTAACCA TTATAAGCTG 6059 
CAATAAACAA GTTAACAACA ACAATTGCAT TCATTTTATG TTTCAGGTTC AGGGGGAGGT 6119 
GTGGGAGGTT TTTTAAAGCA AGTAAAACCT CTACAAATGT GGTATGGCTG ATTATGATCC 6179 
TGCCTCGCGC GTTTCGGTGA TGACGGTGAA AACCTCTGAC ACATGCAGCT CCCGGAGACG 6239 
GTCACAGCTT GTCTGTAAGC GGATGCCGGG AGCAGACAAG CCCGTCAGGG CGCGTCAGCG 6299 
GGTGTTGGCG GGTGTCGGGG CGCAGCCATG ACCCAGTCAC GTAGCGATAG CGGAGTGTAT 6359 
ACTGGCTTAA CTATGCGGCA TCAGAGCAGA TTGTACTGAG AGTGCACCAT ATGTCGGGCC 6419 
GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC 6479 
TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA 6539 
AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT 6599 
CTCCCTTCGG GAAGCGTGGC GCTTTCTCAT AGCTCACGCT GTAGGTATCT CAGTTCGGTG 6659 
TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC 6719 
GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG 6779 
GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC 6839 
TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG 6899 
CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC 6959 
GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT 7019 
CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT 7079 
TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA 7139 
AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA 7199 
TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC 7259 
TGACTCCCCG TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT 7319 
GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT AAACCAGCCA 7379 
GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT CCGCCTCCAT CCAGTCTATT 7439 
AATTGTTGCC GGGAAGCTAG AGTAAGTAGT TCGCCAGTTA ATAGTGCGCA ACGTTGTTGC 7499 
CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT ATGGCTTCAT TCAGCTCCGG 7559 
TTCCCAACGA TCAAGGCGAG TTACATGATC CCCCATGTTG TGCAAAAAAG CGGTTAGCTC 7619 
CTTCGGTCCT CCGATCGTTG TCAGAAGTAA GTTGGCCGCA GTGTTATCAC TCATGGTTAT 7679 
GGCAGCACTG CATAATTCTC TTACTGTCAT GCCATCCGTA AGATGCTTTT CTGTGACTGG 7739 
TGAGTACTCA ACCAAGTCAT TCTGAGAATA GTGTATGCGG CGACCGAGTT GCTCTTGCCC 7799 
GGCGTCAACA CGGGATAATA CCGCGCCACA TAGCAGAACT TTAAAAGTGC TCATCATTGG 7859 
AAAACGTTCT TCGGGGCGAA AACTCTCAAG GATCTTACCG CTGTTGAGAT CCAGTTCGAT 7919 
GTAACCCACT CGTGCACCCA ACTGATCTTC AGCATCTTTT ACTTTCACCA GCGTTTCTGG 7979 
GTGAGCAAAA ACAGGAAGGC AAAATGCCGC AAAAAAGGGA ATAAGGGCGA CACGGAAA 8039 
TTGAATACTC ATACTCTTCC TTTTTCAATA TTATTGAAGC ATTTATCAGG GTTATTGTCT 8099 
CATGAGCGGA TACATATTTG AATGTATTTA GAAAAATAAA CAAATAGGGG TTCCGCGCAC 8159 
ATTTCCCCGA AAAGTGCCAC CTGACGTCTA AGAAACCATT ATTATCATGA CATTAACCTA 8219 
TAAAAATAGG CGTATCACGA GGCCCTTTCG TCTTCAAGAA TTGGTCGATC GACCAATTCT 8279 
CATGTTTGAC AGCTTATCA 8298 
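
The coding stretches of the sequence above are printed as codon rows with a three-letter translation underneath. The following is a minimal sketch, not part of the application as filed, of how one such row might be checked against the standard genetic code. The names `clean_sequence` and `translate` are illustrative assumptions, and the sketch assumes the row starts in frame and contains only complete codons.

```python
import re

# Standard genetic code, laid out in the conventional T, C, A, G order
# for the first, second and third codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter codes; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def clean_sequence(line: str) -> str:
    """Keep only A, C, G and T, dropping spaces and position numbers.

    Any other character is discarded silently, so a misread base
    would shift the reading frame of everything after it.
    """
    return re.sub(r"[^ACGT]", "", line.upper())


def translate(dna: str) -> str:
    """Translate the complete codons of an in-frame DNA string."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


if __name__ == "__main__":
    # Codon row taken from the listing above (light chain variable region).
    row = "GAC ATC CAG ATG ACC CAG AGC CCA AGC AGC CTG AGC GCC AGC GTG GGT 1163"
    print(translate(clean_sequence(row)))
```

Comparing the printed translation with the amino acid row under the same codons shows where the two rows disagree, which points to a misread base or residue name in one of them.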
FIG. 16 
FIG. 17 
FIG. 18 